[PATCH v3 0/3] Asynchronous shutdown interface and example implementation

Tanjore Suresh tansuresh at google.com
Tue May 17 15:08:13 PDT 2022


Problem:

Some of our machines are configured with many NVMe devices and
are validated against strict shutdown time requirements. Each NVMe
device plugged into the system typically takes about 4.5 secs
to shut down. A system with 16 such NVMe devices takes
approximately 80 secs to shut down and go through reboot.

The current shutdown APIs defined at the bus level are
synchronous. Therefore, the more devices there are in the system, the
longer it takes to shut down. This shutdown time contributes
significantly to the machine reboot time.
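
To make the scaling concrete, here is a rough userspace sketch of the
two timing models. It is not part of the patch set; the function names
are made up for illustration, and the asynchronous case assumes that all
device shutdowns overlap completely, which real hardware may not achieve.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Synchronous model: each device is shut down only after the previous
 * one has finished, so the per-device times add up.
 */
static double sync_shutdown_secs(const double *per_dev_secs, size_t n)
{
	double total = 0.0;
	size_t i;

	assert(per_dev_secs != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += per_dev_secs[i];
	return total;
}

/*
 * Idealized asynchronous model (assumption: all shutdowns overlap
 * completely), so the slowest device bounds the total time.
 */
static double async_shutdown_secs(const double *per_dev_secs, size_t n)
{
	double slowest = 0.0;
	size_t i;

	assert(per_dev_secs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (per_dev_secs[i] > slowest)
			slowest = per_dev_secs[i];
	return slowest;
}
```

With 16 devices at 4.5 secs each, the synchronous model gives 72 secs
for the devices alone, while the idealized asynchronous model gives
4.5 secs. The approximately 80 secs quoted above also includes going
through reboot.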

Solution:

This patch set proposes an asynchronous shutdown interface at the bus
level and modifies the driver core's device shutdown routine to exploit
the new interface while maintaining backward compatibility with the
existing synchronous implementation (Patch 1 of 3). It then uses the new
interface to enable all PCI-E based devices to use asynchronous
interface semantics if necessary (Patch 2 of 3). The implementation at
the PCI-E level also works in a backward compatible way, allowing
existing device implementations to keep working with the current
synchronous semantics. Finally, it showcases an example implementation
in which the NVMe device exploits this asynchronous shutdown interface
(Patch 3 of 3).
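
As a rough illustration of the idea, the two-pass flow can be modelled
in userspace as below. This is only a sketch and not the code in the
patches: the model_* and demo_* names, the state field and the callback
signatures are assumptions made for this example. Only the entry point
names async_shutdown_start and async_shutdown_end come from this series.

```c
#include <assert.h>
#include <stddef.h>

struct model_device;

/* Hypothetical stand-in for a bus; not the kernel's struct bus_type. */
struct model_bus {
	void (*shutdown)(struct model_device *dev);             /* synchronous */
	void (*async_shutdown_start)(struct model_device *dev); /* kick off */
	void (*async_shutdown_end)(struct model_device *dev);   /* wait */
};

struct model_device {
	const struct model_bus *bus;
	int state; /* 0 = running, 1 = shutdown in flight, 2 = shut down */
};

/* Demo callbacks: a legacy bus completes in one synchronous call. */
static void demo_sync_shutdown(struct model_device *dev)
{
	dev->state = 2;
}

/* An async-capable bus only starts the shutdown here ... */
static void demo_async_start(struct model_device *dev)
{
	dev->state = 1;
}

/* ... and completes it here; start must have been called first. */
static void demo_async_end(struct model_device *dev)
{
	assert(dev->state == 1);
	dev->state = 2;
}

static const struct model_bus demo_legacy_bus = {
	.shutdown = demo_sync_shutdown,
};

static const struct model_bus demo_async_bus = {
	.async_shutdown_start = demo_async_start,
	.async_shutdown_end = demo_async_end,
};

static void model_device_shutdown(struct model_device **devs, size_t n)
{
	size_t i;

	/* Pass 1: start async shutdowns; legacy buses run synchronously. */
	for (i = 0; i < n; i++) {
		const struct model_bus *bus = devs[i]->bus;

		if (bus->async_shutdown_start)
			bus->async_shutdown_start(devs[i]);
		else if (bus->shutdown)
			bus->shutdown(devs[i]);
	}

	/* Pass 2: wait for the async shutdowns, in the same order. */
	for (i = 0; i < n; i++) {
		const struct model_bus *bus = devs[i]->bus;

		if (bus->async_shutdown_end)
			bus->async_shutdown_end(devs[i]);
	}
}
```

A bus that provides only the synchronous shutdown callback behaves
exactly as before in this model, which is the backward compatibility
property described above. Devices on an async-capable bus all have
their shutdown started before any of them is waited on.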

Changelog:

v2: - Replaced the shutdown_pre & shutdown_post entry point names with the
      recommended names (async_shutdown_start and async_shutdown_end).

    - Agreed with the comment that the ordering between bridge shutdown
      and leaf/endpoint shutdown differed between the calls to
      async_shutdown_start and async_shutdown_end. Both the start and
      end entry points are now called in the same order.

v3: - This note clarifies why the power management framework was not
      considered for implementing this shutdown optimization.
      There is no code change. This change note clarifies
      the reasoning only.

      This patch is only for shutdown of the system. The shutdown
      entry points traditionally have a different requirement:
      all devices are brought to a quiescent state, after which
      system power may be removed (power down request scenarios).
      The same entry points are also used to shut down all devices
      before they are re-initialized and restarted (soft shutdown/reboot
      scenarios).

      In contrast, device power management (dpm) allows the system
      to bring any configured device that may be idle down to the
      various low power states that the device supports, selectively
      and based on the transitions that the device implementation
      allows. The power state transitions initiated by the system
      can be achieved using the 'dpm' interfaces already specified.

      Therefore, using the 'dpm' interface to achieve this shutdown
      optimization is not the right approach: the suggested interface
      is meant to solve an orthogonal requirement and has historically
      been kept separate from the shutdown entry points and their
      associated semantics.

Tanjore Suresh (3):
  driver core: Support asynchronous driver shutdown
  PCI: Support asynchronous shutdown
  nvme: Add async shutdown support

 drivers/base/core.c        | 38 +++++++++++++++++-
 drivers/nvme/host/core.c   | 28 +++++++++----
 drivers/nvme/host/nvme.h   |  8 ++++
 drivers/nvme/host/pci.c    | 80 ++++++++++++++++++++++++--------------
 drivers/pci/pci-driver.c   | 20 ++++++++--
 include/linux/device/bus.h | 12 ++++++
 include/linux/pci.h        |  4 ++
 7 files changed, 149 insertions(+), 41 deletions(-)

-- 
2.36.0.550.gb090851708-goog



