[PATCH] nvme-pci: set some AMD PCIe downstream storage device to D3 for s2idle

Christoph Hellwig hch at lst.de
Tue May 25 07:16:31 PDT 2021


On Tue, May 25, 2021 at 02:06:09PM +0000, Limonciello, Mario wrote:
> Quoting an earlier version commit message:
> 
> "Then the NVMe device will be shutdown by SMU firmware in the s2idle entry
> and then will lost the NVMe power context during s2idle resume. Finally,
> the NVMe command queue request will be processed abnormally and result
> in access timeout"

Where shutdown means power is cut off, right?  NVMe also has the concept
of shutting down the controller using a sequence of register writes and
reads, but if the SMU firmware messes with that we'd have deeper problems
than this.
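
For reference, the register-level shutdown mentioned above can be sketched
roughly as below.  This is a userspace model, not driver code: struct
fake_ctrl, fake_write_cc(), fake_read_csts() and ctrl_shutdown() are made-up
names, and the "device" is simulated.  Only the CC.SHN and CSTS.SHST field
encodings are taken from the NVMe specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field encodings as defined by the NVMe specification. */
#define CC_SHN_MASK	(3u << 14)	/* CC.SHN: shutdown notification */
#define CC_SHN_NORMAL	(1u << 14)	/* request a normal shutdown */
#define CSTS_SHST_MASK	(3u << 2)	/* CSTS.SHST: shutdown status */
#define CSTS_SHST_CMPLT	(2u << 2)	/* shutdown processing complete */

/* Hypothetical stand-in for the memory-mapped controller registers. */
struct fake_ctrl {
	uint32_t cc;
	uint32_t csts;
	int polls_until_done;	/* simulated device latency, in polls */
};

static void fake_write_cc(struct fake_ctrl *c, uint32_t val)
{
	c->cc = val;
}

/* Simulated read: reports completion once enough polls have happened. */
static uint32_t fake_read_csts(struct fake_ctrl *c)
{
	if (c->cc & CC_SHN_MASK) {
		if (c->polls_until_done <= 0)
			c->csts = (c->csts & ~CSTS_SHST_MASK) | CSTS_SHST_CMPLT;
		else
			c->polls_until_done--;
	}
	return c->csts;
}

/*
 * Write CC.SHN, then poll CSTS.SHST until the controller reports that
 * shutdown processing is complete.  A real driver would bound this by
 * time, not by a poll count.
 */
static bool ctrl_shutdown(struct fake_ctrl *c, int max_polls)
{
	fake_write_cc(c, (c->cc & ~CC_SHN_MASK) | CC_SHN_NORMAL);

	for (int i = 0; i < max_polls; i++) {
		if ((fake_read_csts(c) & CSTS_SHST_MASK) == CSTS_SHST_CMPLT)
			return true;
	}
	return false;
}
```

The point of the distinction is that this handshake is something the host
driver initiates, while cutting power is something that happens to the
device from outside.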

> Which I think begs the question - how about if we keep the quirks list and logic
> outside of NVME and also outside of PCI but instead in an AMD owned platform
> driver?  Something like this:

I think what we're all missing here is that requiring devices to go to
D3 for suspend to idle is a higher level concept.  AFAIK this comes from
this Microsoft document:

https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro

and spread from there.  Note that this document explicitly mentions AHCI
in addition to NVMe.  It also has some issues that I can spot:

 - PCIe slots are not specific to storage devices, so this really needs to
   apply to all devices
 - it generally is a rather bad idea to start with, as each shutdown not
   only causes media program/erase cycles, but also is not very power
   efficient.

So what we need is a way for a driver to figure out if for a given
device it should shut down the device fully or just do something that
is efficient for saving as much power as possible.  That can be either
in the form of a flag or by splitting the suspend method into different
ones for different use cases.  Platform-specific code (right now for Intel
and AMD) can then make sure drivers do get the right requests instead of
hardcoding platform information in every driver that wants to be able
to implement intelligent suspend behavior.
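
To make the flag variant a bit more concrete, here is a rough userspace
sketch of the split I have in mind.  None of these names exist in the
kernel: enum suspend_request, enum suspend_action, struct platform_policy,
platform_suspend_request() and driver_suspend() are all invented for
illustration, and the real interface would have to live in the PM core.

```c
#include <stdbool.h>

/* Hypothetical: what the platform asks a driver to do on suspend. */
enum suspend_request {
	SUSPEND_REQ_LOW_POWER,		/* power stays on, pick the cheapest state */
	SUSPEND_REQ_FULL_SHUTDOWN,	/* power may be cut, device context is lost */
};

/* Hypothetical: what a storage driver ends up doing. */
enum suspend_action {
	ACTION_DEVICE_POWER_STATE,	/* use a device-level low power state */
	ACTION_SHUTDOWN_AND_D3,		/* full controller shutdown, then D3 */
};

/* Hypothetical platform description, owned by platform code only. */
struct platform_policy {
	const char *name;
	bool cuts_power_in_s2idle;
};

/* Platform code turns its own knowledge into a generic request. */
static enum suspend_request
platform_suspend_request(const struct platform_policy *p)
{
	return p->cuts_power_in_s2idle ? SUSPEND_REQ_FULL_SHUTDOWN
				       : SUSPEND_REQ_LOW_POWER;
}

/* The driver only looks at the request, never at the platform. */
static enum suspend_action driver_suspend(enum suspend_request req)
{
	switch (req) {
	case SUSPEND_REQ_FULL_SHUTDOWN:
		return ACTION_SHUTDOWN_AND_D3;
	case SUSPEND_REQ_LOW_POWER:
	default:
		return ACTION_DEVICE_POWER_STATE;
	}
}
```

With something along these lines the quirk list sits in one place, and a
driver like nvme-pci just switches on the request it is handed.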


