[PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms
Jon Hunter
jonathanh at nvidia.com
Thu Feb 26 02:34:18 PST 2026
Hi Mani, Bjorn,
On 19/02/2026 17:42, Jon Hunter wrote:
> Hi Mani,
>
> On 16/02/2026 14:35, Jon Hunter wrote:
>
> ...
>
>>> Krishna posted the series a couple of weeks before but forgot to CC you:
>>> https://lore.kernel.org/linux-pci/20260128-d3cold-v1-0-dd8f3f0ce824@oss.qualcomm.com/
>>>
>>> You are expected to use the helper pci_host_common_can_enter_d3cold()
>>> in the suspend path.
>
>
> I have been playing around with this, but so far I have not got anything
> to work. Right now I have just made the following change (note that this
> is based upon Manikanta's fixes series [0]) ...
>
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 9883d14f7f97..9f88e4c1db08 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -2311,6 +2311,7 @@ static int tegra_pcie_dw_suspend_late(struct device *dev)
>  static int tegra_pcie_dw_suspend_noirq(struct device *dev)
>  {
>  	struct tegra_pcie_dw *pcie = dev_get_drvdata(dev);
> +	struct dw_pcie *pci = &pcie->pci;
>
>  	if (pcie->of_data->mode == DW_PCIE_EP_TYPE)
>  		return 0;
> @@ -2318,6 +2319,9 @@ static int tegra_pcie_dw_suspend_noirq(struct device *dev)
>  	if (!pcie->link_state)
>  		return 0;
>
> +	if (!pci_host_common_can_enter_d3cold(pci->pp.bridge))
> +		return 0;
> +
>  	tegra_pcie_dw_pme_turnoff(pcie);
>  	tegra_pcie_unconfig_controller(pcie);
>
>
> At first I was thinking that if we are not actually suspending the
> controller we can skip the configuration of the controller in the
> resume. However, if we skip configuring the controller in the resume
> then the device does not resume at all. So right now I have the
> above, but clearly this is not sufficient. The device resumes but
> the NVMe is not working ...
>
> nvme nvme0: ctrl state 1 is not RESETTING
> nvme nvme0: Disabling device after reset failure: -19
> nvme nvme0: Ignoring bogus Namespace Identifiers
> Aborting journal on device nvme0n1p1-8.
> nvme0n1: detected capacity change from 0 to 976773168
> EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1613: inode #18622533: comm (t-helper): reading directory lblock 0
> Buffer I/O error on dev nvme0n1p1, logical block 60850176, lost sync page write
> Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
> JBD2: I/O error when updating journal superblock for nvme0n1p1-8.
> EXT4-fs (nvme0n1p1): I/O error while writing superblock
> EXT4-fs error (device nvme0n1p1): ext4_journal_check_start:86: comm rs:main Q:Reg: Detected aborted journal
> Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
> EXT4-fs (nvme0n1p1): I/O error while writing superblock
> EXT4-fs (nvme0n1p1): Remounting filesystem read-only
> EXT4-fs (nvme0n1p1): shut down requested (2)
>
> Is the above what you were thinking? Anything else I am missing?
NVMe is still broken for us and, I admit, I don't fully understand the
issue. However, it seems to me that this change does not work as
intended on all devicetree platforms. So for now, would it be acceptable
to add a callback that lets drivers such as the Tegra194 PCIe driver opt
out of the override? This would at least allow NVMe to work as it did
before.
Thanks
Jon
--
nvpublic