[PATCH 3/3] nvme-multipath: skip inaccessible paths during partition scan
Hannes Reinecke
hare at suse.de
Wed Sep 11 05:04:11 PDT 2024
On 9/11/24 13:24, Sagi Grimberg wrote:
> Hannes, if this patch fixes a bug, please phrase the title to reflect that.
>
>
> On 11/09/2024 12:51, Hannes Reinecke wrote:
>> When a path is switched to 'inaccessible' during partition scan
>> triggered via device_add_disk() and we only have one path the
>> system will be stuck as nvme_available_path() will always return
>> 'true'. So I/O will never be completed and the system is stuck
>> in device_add_disk().
>> This patch introduces a flag NVME_NSHEAD_DISABLE_QUEUEING to
>> cause nvme_available_path() to always return NULL, and with
>> that I/O to be failed if all paths are unavailable.
>
> So what will happen if a new device comes along, and the first
> path it connects to is 'inaccessible'? What will scan partitions?
>
Nothing will. But nothing scans them today either, as we never call
nvme_mpath_set_live() in that case:

nvme_mpath_add_disk()
  -> nvme_update_ns_ana_state():
        if (nvme_state_is_live(ns->ana_state) &&
            nvme_ctrl_state(ns->ctrl) == NVME_CTRL_LIVE)
                nvme_mpath_set_live(ns);

> I don't think that this approach is the right one.
> Effectively, there is a semantic decision here, is 'inaccessible' a
> temporary state or not, if it is we should queue IO knowing that
> it may change in the future, and if not, we should fail it.
>
Yes.
I would argue that we should never requeue I/O if it's triggered
from partition scan, i.e. from within device_add_disk().
Reasoning is as follows:
We never call nvme_mpath_set_live() during initial scan if all paths are
inaccessible.
So when a path is 'optimized' / 'non-optimized' and the path state
changes to 'inaccessible' while device_add_disk() is running, we are
perfectly fine to disable the device (and kill all I/O), as this is
what would have happened if the path had been inaccessible originally.
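
To make the argument concrete, here is a minimal userspace sketch of the
decision I have in mind. It is not kernel code: the types and the function
model_submit() are made up for illustration, and the 'disable_queueing'
field merely stands in for the NVME_NSHEAD_DISABLE_QUEUEING flag from the
patch description. It assumes that only 'optimized' and 'non-optimized'
paths count as live.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the ANA states discussed in this thread. */
enum ana_state { ANA_OPTIMIZED, ANA_NONOPTIMIZED, ANA_INACCESSIBLE };

struct model_path {
	enum ana_state state;
};

/* Hypothetical model of a namespace head; not the kernel structure. */
struct model_head {
	struct model_path *paths;
	size_t nr_paths;
	bool disable_queueing;	/* models NVME_NSHEAD_DISABLE_QUEUEING */
};

enum io_action { IO_SUBMIT, IO_REQUEUE, IO_FAIL };

/* Assumption: only optimized and non-optimized paths can take I/O. */
static bool path_is_live(const struct model_path *p)
{
	return p->state == ANA_OPTIMIZED || p->state == ANA_NONOPTIMIZED;
}

/*
 * Decide what happens to an I/O on this head:
 * - submit it if any path is live,
 * - otherwise fail it when queueing is disabled (e.g. during the
 *   initial partition scan),
 * - otherwise requeue it and wait for a path to come back.
 */
static enum io_action model_submit(const struct model_head *h)
{
	for (size_t i = 0; i < h->nr_paths; i++)
		if (path_is_live(&h->paths[i]))
			return IO_SUBMIT;

	return h->disable_queueing ? IO_FAIL : IO_REQUEUE;
}
```

With a single inaccessible path and queueing enabled, the model requeues
indefinitely, which is the hang in device_add_disk(); with queueing
disabled, the I/O fails and the scan can finish.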

> ANA inaccessible is semantically temporary. It's just that your test
> treats it as permanent. For this case you have fast_io_fail_tmo, which is
> designed to give up also in cases where there is hope that any path will
> become online in the future.
>
It's not just ANA inaccessible. We're facing a similar problem if the
target returns PATH_ERROR during scanning; then we're constantly failing
over I/O to the next path, returning PATH_ERROR, failing over to the
next path, ...
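
The failover loop can be modelled in a few lines of userspace C. This is
only a sketch under stated assumptions: the names are hypothetical, each
path returns a fixed result, and the 'budget' parameter exists purely so
that the model terminates; nothing bounds the loop in the scenario above.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_status { MODEL_OK, MODEL_PATH_ERROR };

/* Hypothetical table of what each path returns on submission. */
struct model_paths {
	const enum model_status *result;
	size_t nr_paths;
};

/*
 * Submit on the current path; on a path error, fail over to the next
 * path and try again. Returns the number of submissions made and sets
 * *completed if one of them succeeded. 'budget' caps the number of
 * submissions so the model cannot spin forever.
 */
static size_t model_failover(const struct model_paths *p, size_t budget,
			     bool *completed)
{
	size_t attempts = 0;
	size_t cur = 0;

	*completed = false;
	while (attempts < budget && p->nr_paths > 0) {
		attempts++;
		if (p->result[cur] == MODEL_OK) {
			*completed = true;
			break;
		}
		/* Path error: fail over to the next path. */
		cur = (cur + 1) % p->nr_paths;
	}
	return attempts;
}
```

If every path returns a path error, the model consumes the whole budget
without completing the I/O, however large the budget is.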

> I do agree that if the user disconnects the last path, should it be
> inaccessible or not, it should succeed, and cause the queued IO to fail
> immediately.

I'll check if that would work for my testcases.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich