[PATCH] nvme-pci: set some AMD PCIe downstream storage device to D3 for s2idle
Keith Busch
kbusch at kernel.org
Tue May 25 13:24:21 PDT 2021
On Tue, May 25, 2021 at 08:09:02PM +0000, Limonciello, Mario wrote:
> [Public]
>
> > On Tue, May 25, 2021 at 02:06:09PM +0000, Limonciello, Mario wrote:
> > > "Then the NVMe device will be shutdown by SMU firmware in the s2idle
> > entry
> > > and then will lost the NVMe power context during s2idle resume. Finally,
> > > the NVMe command queue request will be processed abnormally and
> > result
> > > in access timeout"
> >
> > The nvme driver explicitly checks pm_set_suspend_via_firmware() in order
> > to know if firmware may manipulate our device after completing the idle
> > suspend. That is returning false here, yet firmware will do something
> > anyway.
>
> pm_set_suspend_via_firmware is not set during s2idle - from drivers/acpi/sleep.c
> it means ACPI S3 or ACPI S4 and thus pm_suspend_via_firmware however would
> not be used.
>
> Overloading this definition on these AMD platforms to solve this NVME problem
> would have unintended consequences. Just glancing through the kernel I notice
> the following drivers make use of that for decisions, which I would suspect to be
> problematic:
>
> * cros_ec/gsmi (on any AMD chromebook, EC might receive wrong event and logging wrong)
> * tpm
> * i8042
> * amdgpu (would break DPM_FLAG_SMART_SUSPEND)
Would it be less problematic if we check pm_suspend_no_platform()
instead? According to the kernel-doc, that returns 'true' when
firmware will not touch our device's power state, so it should return
'false' in order to be accurate here.
Note, this is always set for PM_SUSPEND_TO_IDLE, so we'd still need a
quirk to override it on this platform, but maybe this check doesn't
have the same clashes for you like pm_suspend_via_firmware?
More information about the Linux-nvme
mailing list