[PATCH 3/3] nvme-multipath: skip inaccessible paths during partition scan
Sagi Grimberg
sagi at grimberg.me
Wed Sep 11 04:24:25 PDT 2024
Hannes, if this patch fixes a bug, please phrase the title to reflect that.
On 11/09/2024 12:51, Hannes Reinecke wrote:
> When a path is switched to 'inaccessible' during partition scan
> triggered via device_add_disk() and we only have one path the
> system will be stuck as nvme_available_path() will always return
> 'true'. So I/O will never be completed and the system is stuck
> in device_add_disk().
> This patch introduces a flag NVME_NSHEAD_DISABLE_QUEUEING to
> cause nvme_available_path() to always return NULL, and with
> that I/O to be failed if all paths are unavailable.
So what will happen if a new device comes along, and the first
path it connects to is 'inaccessible'? What will scan partitions?
I don't think that this approach is the right one.
Effectively, there is a semantic decision here: is 'inaccessible' a
temporary state or not? If it is, we should queue IO knowing that
it may change in the future; if not, we should fail it.
ANA inaccessible is semantically temporary. It's just that your test
treats it as permanent. For this case you have fast_io_fail_tmo, which is
designed to give up even in cases where there is hope that a path will
come online in the future.
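
To make the queue-vs-fail decision concrete, here is a rough sketch as a
toy model. It is not the kernel code: the enum names, decide_io() and its
parameters are all made up for illustration, and fail_tmo_secs only stands
in for the role fast_io_fail_tmo plays in the argument above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
enum path_state { PATH_OPTIMIZED, PATH_INACCESSIBLE, PATH_GONE };
enum io_action { IO_SUBMIT, IO_QUEUE, IO_FAIL };

/*
 * Decide what to do with an I/O given the state of every path.
 * waited_secs:   how long the I/O has been waiting for a usable path
 * fail_tmo_secs: assumed stand-in for a fast-fail timeout
 */
static enum io_action decide_io(const enum path_state *paths, size_t npaths,
                                unsigned int waited_secs,
                                unsigned int fail_tmo_secs)
{
    bool may_recover = false;

    for (size_t i = 0; i < npaths; i++) {
        if (paths[i] == PATH_OPTIMIZED)
            return IO_SUBMIT;      /* a usable path exists right now */
        if (paths[i] == PATH_INACCESSIBLE)
            may_recover = true;    /* temporary state, may change later */
    }

    if (!may_recover)
        return IO_FAIL;            /* no paths left: fail immediately */
    if (waited_secs >= fail_tmo_secs)
        return IO_FAIL;            /* gave up waiting despite some hope */
    return IO_QUEUE;               /* keep the I/O queued for now */
}
```

In this model a single inaccessible path queues I/O until the timeout
expires, while removing the last path fails it right away.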
I do agree that if the user disconnects the last path, whether it is
inaccessible or not, the disconnect should succeed and cause the queued
IO to fail immediately.