[PATCHv3 0/4] nvme: NSHEAD_DISK_LIVE fixes
Hannes Reinecke
hare at kernel.org
Fri Sep 6 00:18:24 PDT 2024
Hi all,
I have a testcase which repeatedly deletes namespaces on the target
and creates new ones, aggressively re-using NSIDs for the new
namespaces.
To add to the fun, these namespaces are created on different nodes
in the cluster, where only the paths local to the cluster node are
active, and all other paths are inaccessible.
Essentially it's doing something like:
echo 0 > ${ns}/enable
rmdir ${ns}
<random delay>
mkdir ${ns}
echo "<dev>" > ${ns}/device_path
echo "<grpid>" > ${ns}/ana_grpid
uuidgen > ${ns}/device_uuid
echo 1 > ${ns}/enable
repeatedly with several namespaces and several ANA groups.
This leads to an unrecoverable system where the scanning processes
are stuck in the partition scanning code triggered via
'device_add_disk()', waiting for I/O which will never complete.
There are two parts to fixing this:
- We need to ensure the NSHEAD_DISK_LIVE is properly set when the
  ns_head is live, and unset when the last path is gone.
- We need to trigger the requeue list after NSHEAD_DISK_LIVE
  has been cleared to flush all outstanding I/O.
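To illustrate why both parts are needed, here is a minimal userspace sketch of the idea. It is a toy model, not the kernel code: 'struct model_head' and all 'model_*' functions are hypothetical names made up for this example, and 'disk_live' merely stands in for the NSHEAD_DISK_LIVE bit. The assumption is that I/O without a usable path is parked while the head is live, and can only be failed once the bit is cleared and the requeue list is kicked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an ns_head; not the kernel structure. */
struct model_head {
	bool disk_live;   /* stands in for NSHEAD_DISK_LIVE */
	int nr_paths;     /* paths attached to the head */
	int nr_usable;    /* paths that can actually carry I/O */
	int parked;       /* I/Os waiting on the requeue list */
	int submitted;    /* I/Os sent down a usable path */
	int failed;       /* I/Os completed with an error */
};

/* Submit one I/O: use a path, park it, or fail it. */
static void model_submit(struct model_head *h)
{
	if (h->nr_usable > 0)
		h->submitted++;
	else if (h->disk_live)
		h->parked++;      /* a path may still show up */
	else
		h->failed++;      /* head is dead, give up */
}

/* Kick the requeue list: every parked I/O is submitted again. */
static void model_kick_requeue(struct model_head *h)
{
	int n = h->parked;

	h->parked = 0;
	while (n-- > 0)
		model_submit(h);
}

static void model_add_path(struct model_head *h, bool usable)
{
	h->nr_paths++;
	if (usable)
		h->nr_usable++;
	h->disk_live = true;
	model_kick_requeue(h);
}

static void model_remove_path(struct model_head *h, bool usable)
{
	assert(h->nr_paths > 0);
	h->nr_paths--;
	if (usable)
		h->nr_usable--;
	/* Part one: clear the live state when the last path is gone. */
	if (h->nr_paths == 0)
		h->disk_live = false;
	/* Part two: kick the requeue list so parked I/O is flushed. */
	model_kick_requeue(h);
}
```

In this model, two I/Os submitted against a head whose only path is inaccessible stay parked; removing that path clears 'disk_live' and the kick turns both into errors. Without the kick they would sit on the list forever, which is roughly the hang seen in the partition scan.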
Turns out there's another corner case; when running the same test
but not removing the namespaces while changing the UUID we end up
with I/Os constantly being retried, and we are unable to even
disconnect the controller. To fix this we should set the
'failfast' flag for the controller when disconnecting to ensure
that all I/O is aborted.
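The effect of such a flag can be sketched with another small userspace model. Again this is not the kernel implementation: 'struct model_ctrl', 'model_error_verdict()' and 'model_remove_namespaces()' are hypothetical names for illustration only, and the retry budget is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller model; only the flag matters here. */
struct model_ctrl {
	bool failfast_expired;
};

enum model_verdict {
	MODEL_RETRY,  /* put the I/O back for another attempt */
	MODEL_FAIL,   /* complete the I/O with an error */
};

/* Decide what happens to an I/O that completed with an error. */
static enum model_verdict model_error_verdict(const struct model_ctrl *c,
					      int retries_left)
{
	assert(retries_left >= 0);
	if (c->failfast_expired || retries_left == 0)
		return MODEL_FAIL;
	return MODEL_RETRY;
}

/* On removal, flip the flag so nothing is retried any more. */
static void model_remove_namespaces(struct model_ctrl *c)
{
	c->failfast_expired = true;
}
```

Before removal an I/O with retries left is retried; once the flag is set the same I/O is failed right away, so the disconnect can make progress.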
With these patches (and the queue freeze patchset from hch) the problem
is resolved and the testcase runs without issues.
I'll see to getting the testcase added to blktests.
As usual, comments and reviews are welcome.
Changes to v2:
- Include reviews from Sagi
- Drop the check for NSHEAD_DISK_LIVE in nvme_available_path()
- Add a patch to requeue I/O if the ANA state changed
- Set the 'failfast' flag when removing controllers
Changes to the original submission:
- Drop patch to remove existing namespaces on ID mismatch
- Combine patches updating NSHEAD_DISK_LIVE handling
- Requeue I/O after NSHEAD_DISK_LIVE has been cleared
Hannes Reinecke (4):
nvme-multipath: fixup typo when clearing DISK_LIVE
nvme-multipath: check for NVME_NSHEAD_DISK_LIVE when selecting paths
nvme-multipath: always requeue I/O when updating the ANA state
nvme: set 'failfast_expired' in nvme_remove_namespaces()
drivers/nvme/host/core.c | 7 +++++++
drivers/nvme/host/multipath.c | 23 +++++++++++++++++------
2 files changed, 24 insertions(+), 6 deletions(-)
--
2.35.3