[PATCH v10 0/5] shut down devices asynchronously

Laurence Oberman loberman at redhat.com
Mon Jun 30 15:02:02 PDT 2025


On Mon, 2025-06-30 at 20:33 +0000, Michael Kelley wrote:
> From: Stuart Hayes <stuart.w.hayes at gmail.com> Sent: Wednesday, June 25, 2025 1:19 PM
> > 
> > This adds the ability for the kernel to shut down devices
> > asynchronously.
> > 
> > Only devices with drivers that enable it are shut down
> > asynchronously.
> > 
> > This can dramatically reduce system shutdown/reboot time on systems that
> > have multiple devices that take many seconds to shut down (like certain
> > NVMe drives). On one system tested, the shutdown time went from 11 minutes
> > without this patch to 55 seconds with the patch.
> 
> I've tested this version and all looks good. I did the same tests that I did
> with v9 [1], running in a VM in the Azure cloud. The 2 NVMe devices are
> shut down in parallel, gaining about 110 milliseconds, and there were no
> slowdowns as seen in v9. The net gain was ~100 ms.
> 
> I also tested a local Hyper-V VM that does not have any NVMe devices.
> The shutdown timings with and without this patch set are pretty much
> the same, which was not the case with v9.
> 
> I did not repeat the more detailed debugging from v9 as reported
> here [2], since there is no unexpected slowness with v10.
> 
> For the series,
> 
> Tested-by: Michael Kelley <mhklinux at outlook.com>
> 
> [1] https://lore.kernel.org/lkml/BN7PR02MB41480DE777B9C224F3C2DF43D4792@BN7PR02MB4148.namprd02.prod.outlook.com/
> [2] https://lore.kernel.org/lkml/SN6PR02MB41571E2DD410D09CE7494B38D4402@SN6PR02MB4157.namprd02.prod.outlook.com/
> 
> > 
> > Changes from V9:
> > 
> > Address resource and timing issues when spawning a unique async thread
> > for every device during shutdown:
> >   * Make the asynchronous threads able to shut down multiple devices,
> >     instead of spawning a unique thread for every device.
> >   * Modify core kernel async code with a custom wake function so it
> >     doesn't wake up threads waiting to synchronize every time the cookie
> >     changes
> > 
> > Changes from V8:
> > 
> > Deal with shutdown hangs resulting when a parent/supplier device is
> >   later in the devices_kset list than its children/consumers:
> >   * Ignore sync_state_only devlinks for shutdown dependencies
> >   * Ignore shutdown_after for devices that don't want async shutdown
> >   * Add a sanity check to revert to sync shutdown for any device that
> >     would otherwise wait for a child/consumer shutdown that hasn't
> >     already been scheduled
> > 
> > Changes from V7:
> > 
> > Do not expose driver async_shutdown_enable in sysfs.
> > Wrapped a long line.
> > 
> > Changes from V6:
> > 
> > Removed a sysfs attribute that allowed the async device shutdown to be
> > "on" (with driver opt-out), "safe" (driver opt-in), or "off"... what was
> > previously "safe" is now the only behavior, so drivers now only need to
> > have the option to enable or disable async shutdown.
> > 
> > Changes from V5:
> > 
> > Separated into multiple patches to make review easier.
> > Reworked some code to make it more readable
> > Made devices wait for consumers to shut down, not just children
> >   (suggested by David Jeffery)
> > 
> > Changes from V4:
> > 
> > Change code to use cookies for synchronization rather than async domains
> > Allow async shutdown to be disabled via sysfs, and allow driver opt-in or
> >   opt-out of async shutdown (when not disabled), with ability to control
> >   driver opt-in/opt-out via sysfs
> > 
> > Changes from V3:
> > 
> > Bug fix (used "parent" not "dev->parent" in device_shutdown)
> > 
> > Changes from V2:
> > 
> > Removed recursive functions to schedule children to be shut down before
> >   parents, since existing device_shutdown loop will already do this
> > 
> > Changes from V1:
> > 
> > Rewritten using kernel async code (suggested by Lukas Wunner)
> > 
> > David Jeffery (1):
> >   kernel/async: streamline cookie synchronization
> > 
> > Stuart Hayes (4):
> >   driver core: don't always lock parent in shutdown
> >   driver core: separate function to shutdown one device
> >   driver core: shut down devices asynchronously
> >   nvme-pci: Make driver prefer asynchronous shutdown
> > 
> >  drivers/base/base.h           |   8 ++
> >  drivers/base/core.c           | 210 +++++++++++++++++++++++++++++-----
> >  drivers/nvme/host/pci.c       |   1 +
> >  include/linux/device/driver.h |   2 +
> >  kernel/async.c                |  42 ++++++-
> >  5 files changed, 236 insertions(+), 27 deletions(-)
> > 
> > --
> > 2.39.3
> > 
> 
> 

For the series:

Tested against kernel 6.16.0-rc4-dirty on x86_64.

Shutdown takes about 15 seconds, compared to almost 60.
This is the same set of tests I always run, and the results are stable and
repeatable.

Looks good again, although V9 also looked good until Mike Kelley found
his issues.
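
For anyone skimming the thread, here is a rough model of why the ordering
rule in the cover letter (a device waits for its children/consumers, and
independent devices shut down in parallel) shortens shutdown. It is only a
sketch, not the kernel code: the device names, the durations, and the
WAITS_FOR table are made up, and it assumes every device opts in to async
shutdown with no limit on how many run at once.

```python
# Hypothetical model of serial vs. asynchronous device shutdown.
# Not kernel code: names, durations and dependencies are invented.
from functools import lru_cache

# Seconds each device takes to shut down (illustrative values only).
DURATION = {"nvme0": 10.0, "nvme1": 10.0, "nvme2": 10.0, "pci_bridge": 1.0}

# A device may only start shutting down once these children/consumers finish.
WAITS_FOR = {"pci_bridge": ["nvme0", "nvme1", "nvme2"]}


def sync_shutdown_time(duration):
    """Serial shutdown: devices are shut down one after another."""
    return sum(duration.values())


def async_shutdown_time(duration, waits_for):
    """Parallel shutdown: a device starts when everything it waits for is done."""

    @lru_cache(maxsize=None)
    def finish(dev):
        # Start time is the latest finish time among the devices we wait for.
        start = max((finish(d) for d in waits_for.get(dev, [])), default=0.0)
        return start + duration[dev]

    # Total time is when the last device finishes.
    return max(finish(dev) for dev in duration)


if __name__ == "__main__":
    print("sync :", sync_shutdown_time(DURATION))
    print("async:", async_shutdown_time(DURATION, WAITS_FOR))
```

In this toy setup the serial total is the sum of all durations, while the
parallel total is the longest dependency chain, so slow but independent
devices such as several NVMe drives overlap instead of adding up.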
 
Tested-by: Laurence Oberman <loberman at redhat.com>




