[PATCH] nvme-pci: set some AMD PCIe downstream storage device to D3 for s2idle

Limonciello, Mario Mario.Limonciello at amd.com
Wed May 26 10:42:02 PDT 2021


[Public]

[resend without AMD official use tags; sorry my email client likes to change this constantly]

> -----Original Message-----
> From: Rafael J. Wysocki <rjw at rjwysocki.net>
> Sent: Wednesday, May 26, 2021 12:28
> To: Hans de Goede <hdegoede at redhat.com>; Keith Busch
> <kbusch at kernel.org>; Limonciello, Mario <Mario.Limonciello at amd.com>
> Cc: Christoph Hellwig <hch at lst.de>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Liang, Prike <Prike.Liang at amd.com>;
> axboe at fb.com; sagi at grimberg.me; linux-nvme at lists.infradead.org; S-k,
> Shyam-sundar <Shyam-sundar.S-k at amd.com>
> Subject: Re: [PATCH] nvme-pci: set some AMD PCIe downstream storage device
> to D3 for s2idle
> 
> On Wednesday, May 26, 2021 7:02:08 PM CEST Limonciello, Mario wrote:
> > [Public]
> >
> >
> > > >
> > > > For context, here's the summary from my understanding:
> > > >
> > > > We (linux-nvme) received a bug report that a platform fails to resume
> > > > after idle suspend due to mismatched behavior with the nvme driver.
> > > >
> > > > When suspending, the nvme driver checks pm_suspend_via_firmware(). If
> > > > false, the driver assumes platform firmware will not alter our device's
> > > > power state after the kernel completes its suspend process.
> > > >
> > > > But this platform's SMU firmware will remove power from the device.
> > >
> > > How exactly does it do that?
> >
> > It's running as a result of a platform driver notifying it to run (amd-pmc).
> 
> I guess this happens in one of the amd-pmc driver's system-wide suspend
> callbacks.  Which one?

IIRC it should be caused by:
.suspend_noirq

> 
> > >
> > > In particular, how does it get a chance to run?
> > >
> > > > Since the driver believed that wouldn't happen, the driver did not
> > > > prepare the device for this powerloss event.
> > > >
> > > > It seems that the kernel's assumptions around pm_suspend_via_firmware()
> > > > and pm_suspend_no_platform() may not accurately reflect what the
> > > > platform's firmware actually does.
> > >
> > > Note that this is not about whether or not AML will remove power from
> devices.
> > >
> > > It is about passing control entirely to the platform firmware at the end of the
> > > suspend transition.
> > >
> > > If instead the kernel executes AML that happens to remove power from
> some
> > > devices, that is a totally different case which should not be confused with
> > > the above.
> > >
> > > > I do not know of a better way to detect if the platform will remove power,
> > > > so I'm looking at quirks to suppress PM_SUSPEND_FLAG_NO_PLATFORM
> for
> > > > this platform. I'm hoping there's a better option, though :)
> > >
> > > Honestly, I'm not sure about the clear understanding of what's really going
> on
> > > here.
> > >
> > >
> >
> > We'll discuss internally and come back with a different proposal.
> > Thanks all for your feedback.
> >
> 
> OK
> 
> 


More information about the Linux-nvme mailing list