[PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states set by BIOS for devicetree platforms

Jon Hunter jonathanh at nvidia.com
Thu Feb 26 02:34:18 PST 2026


Hi Mani, Bjorn,

On 19/02/2026 17:42, Jon Hunter wrote:
> Hi Mani,
> 
> On 16/02/2026 14:35, Jon Hunter wrote:
> 
> ...
> 
>>> Krishna posted the series a couple of weeks before but forgot to CC you:
>>> https://lore.kernel.org/linux-pci/20260128-d3cold-v1-0-dd8f3f0ce824@oss.qualcomm.com/
>>>
>>> You are expected to use the helper pci_host_common_can_enter_d3cold() in the
>>> suspend path.
> 
> 
> I have been playing around with this, but so far I have not got anything
> to work. Right now I have just made the following change (note that this
> is based upon Manikanta's fixes series [0]) ...
> 
> diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> index 9883d14f7f97..9f88e4c1db08 100644
> --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> @@ -2311,6 +2311,7 @@ static int tegra_pcie_dw_suspend_late(struct device *dev)
>   static int tegra_pcie_dw_suspend_noirq(struct device *dev)
>   {
>          struct tegra_pcie_dw *pcie = dev_get_drvdata(dev);
> +       struct dw_pcie *pci = &pcie->pci;
> 
>          if (pcie->of_data->mode == DW_PCIE_EP_TYPE)
>                  return 0;
> @@ -2318,6 +2319,9 @@ static int tegra_pcie_dw_suspend_noirq(struct device *dev)
>          if (!pcie->link_state)
>                  return 0;
> 
> +       if (!pci_host_common_can_enter_d3cold(pci->pp.bridge))
> +               return 0;
> +
>          tegra_pcie_dw_pme_turnoff(pcie);
>          tegra_pcie_unconfig_controller(pcie);
> 
> 
> At first I was thinking that if we are not actually suspending the
> controller we can skip the configuration of the controller in the
> resume. However, if we skip configuring the controller in the resume
> then the device does not resume at all. So right now I have the
> above, but clearly this is not sufficient. The device resumes but
> the NVMe is not working ...
> 
>   nvme nvme0: ctrl state 1 is not RESETTING
>   nvme nvme0: Disabling device after reset failure: -19
>   nvme nvme0: Ignoring bogus Namespace Identifiers
>   Aborting journal on device nvme0n1p1-8.
>   nvme0n1: detected capacity change from 0 to 976773168
>   EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1613: inode #18622533: comm (t-helper): reading directory lblock 0
>   Buffer I/O error on dev nvme0n1p1, logical block 60850176, lost sync page write
>   Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
>   JBD2: I/O error when updating journal superblock for nvme0n1p1-8.
>   EXT4-fs (nvme0n1p1): I/O error while writing superblock
>   EXT4-fs error (device nvme0n1p1): ext4_journal_check_start:86: comm rs:main Q:Reg: Detected aborted journal
>   Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
>   EXT4-fs (nvme0n1p1): I/O error while writing superblock
>   EXT4-fs (nvme0n1p1): Remounting filesystem read-only
>   EXT4-fs (nvme0n1p1): shut down requested (2)
> 
> Is the above what you were thinking? Anything else I am missing?

So NVMe is still broken for us and I admit, I don't fully understand the 
issue. However, it seems to me that this change is not working for all 
device-tree platforms as intended. So for now, would it be acceptable to 
add a callback function for drivers such as the Tegra194 PCIe driver to 
opt out of this? This would at least allow NVMe to work as it was before.
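
To make the idea concrete, here is a rough userspace sketch of how such an
opt-out could be modelled. Every name in it (struct host_bridge,
keep_fw_aspm_state, should_override_aspm, tegra194_keep_fw_aspm_state) is
hypothetical and only for illustration; none of this is existing kernel API,
and the real hook would most likely live somewhere else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a host bridge; NOT the kernel's pci_host_bridge. */
struct host_bridge {
	const char *name;
	/*
	 * Hypothetical opt-out hook. A driver sets this and returns true to
	 * keep the ASPM/Clock PM state programmed by firmware.
	 */
	bool (*keep_fw_aspm_state)(const struct host_bridge *bridge);
};

/* Hypothetical driver callback: this controller opts out of the override. */
static bool tegra194_keep_fw_aspm_state(const struct host_bridge *bridge)
{
	(void)bridge;
	return true;
}

/*
 * Models the core decision: override the firmware state by default,
 * unless the driver provides the hook and asks to keep the state.
 */
static bool should_override_aspm(const struct host_bridge *bridge)
{
	assert(bridge != NULL);

	if (bridge->keep_fw_aspm_state && bridge->keep_fw_aspm_state(bridge))
		return false;

	return true;
}
```

With this shape, a bridge that leaves the hook NULL keeps the new default
(override), so only drivers that explicitly opt out change behaviour.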

Thanks
Jon

-- 
nvpublic