[PATCH 0/3] nvme-multipath: fix deadlock in device_add_disk()
Hannes Reinecke
hare at kernel.org
Mon Oct 7 03:01:31 PDT 2024
Hi all,
I have a testcase which repeatedly disables namespaces on the target,
assigns a new UUID (to simulate namespace remapping), and enables the
namespace again.
To throw in more fun, these namespaces also have their ANA group ID changed
to simulate namespaces moving around in a cluster, where only the paths
local to the cluster node are active and all other paths are inaccessible.
Essentially it's doing something like:
  echo 0 > ${ns}/enable
  <random delay>
  echo "<dev>" > ${ns}/device_path
  echo "<grpid>" > ${ns}/ana_grpid
  uuidgen > ${ns}/device_uuid
  echo 1 > ${ns}/enable
i.e. a testcase similar to the one from the previous patchset, only this
time I'm just disabling and re-enabling the namespace without removing it
from the target.
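For anyone wanting to reproduce this, one cycle of the sequence above could be wrapped roughly as follows. This is only a sketch: the function names remap_namespace and new_uuid, the argument order, and the integer-second delay are assumptions made for illustration, not taken from the actual test script. ${ns} is assumed to be an nvmet configfs namespace directory.

```shell
#!/bin/sh
# Hypothetical helper: print a fresh UUID. Prefers uuidgen (as used in
# the sequence above) and falls back to the kernel's random UUID source.
new_uuid() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    else
        cat /proc/sys/kernel/random/uuid
    fi
}

# Hypothetical helper: run one disable/remap/enable cycle.
#   $1  namespace directory (assumed: nvmet configfs namespace dir)
#   $2  backing device path
#   $3  new ANA group ID
#   $4  maximum random delay in whole seconds (assumed default: 5)
remap_namespace() {
    ns=$1
    dev=$2
    grpid=$3
    max_delay=${4:-5}

    # Disable the namespace; the attributes below are changed while it is off.
    echo 0 > "${ns}/enable"

    # Random delay of 0..max_delay seconds so that host-side rescans
    # hit different points of the sequence on each run.
    sleep "$(awk -v max="$max_delay" 'BEGIN { srand(); print int(rand() * (max + 1)) }')"

    # Remap: new backing device, new ANA group, new UUID.
    echo "$dev" > "${ns}/device_path"
    echo "$grpid" > "${ns}/ana_grpid"
    new_uuid > "${ns}/device_uuid"

    # Re-enable the namespace; the host is expected to rescan it.
    echo 1 > "${ns}/enable"
}
```

A test would then call remap_namespace in a loop over the namespaces of the subsystem, picking <dev> and <grpid> per iteration; those values are left open here just as in the sequence above.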
This is causing lockups in device_add_disk(), as the partition scan is
constantly retrying I/O and never completes.
Funnily enough, the very same issue should have been fixed with
ecca390e8056 ("nvme: fix deadlock in disconnect during scan_work
and/or ana_work"), but that fix seems to be imperfect.
As usual, comments and reviews are welcome.
Hannes Reinecke (3):
  nvme-multipath: simplify loop in nvme_update_ana_state()
  nvme-multipath: cannot disconnect controller on stuck partition scan
  nvme-multipath: skip failed paths during partition scan

 drivers/nvme/host/multipath.c | 51 ++++++++++++++++++++++++++---------
 drivers/nvme/host/nvme.h      |  1 +
 2 files changed, 40 insertions(+), 12 deletions(-)
--
2.35.3