[PATCH 3/3] nvme-multipath: stuck partition scan on inaccessible paths
Hannes Reinecke
hare at suse.de
Wed Sep 11 09:10:26 PDT 2024
On 9/11/24 17:20, Sagi Grimberg wrote:
>
>
>
> On 11/09/2024 15:46, Hannes Reinecke wrote:
>> When a path is switched to 'inaccessible' during partition scan
>> triggered via device_add_disk() and we only have one path the
>> system will be stuck as nvme_available_path() will always return
>> 'true'. So I/O will never be completed and the system is stuck
>> in device_add_disk():
>>
>> [<0>] folio_wait_bit_common+0x12a/0x310
>> [<0>] filemap_read_folio+0x97/0xd0
>> [<0>] do_read_cache_folio+0x108/0x390
>> [<0>] read_part_sector+0x31/0xa0
>> [<0>] read_lba+0xc5/0x160
>> [<0>] efi_partition+0xd9/0x8f0
>> [<0>] bdev_disk_changed+0x23d/0x6d0
>> [<0>] blkdev_get_whole+0x78/0xc0
>> [<0>] bdev_open+0x2c6/0x3b0
>> [<0>] bdev_file_open_by_dev+0xcb/0x120
>> [<0>] disk_scan_partitions+0x5d/0x100
>> [<0>] device_add_disk+0x402/0x420
>> [<0>] nvme_mpath_set_live+0x4f/0x1f0 [nvme_core]
>> [<0>] nvme_mpath_add_disk+0x107/0x120 [nvme_core]
>> [<0>] nvme_alloc_ns+0xac6/0xe60 [nvme_core]
>> [<0>] nvme_scan_ns+0x2dd/0x3e0 [nvme_core]
>> [<0>] nvme_scan_work+0x1a3/0x490 [nvme_core]
>>
>> This patch introduces a flag NVME_NSHEAD_FAIL_ON_LAST_PATH to
>> cause nvme_available_path() to always return NULL, and with
>> that I/O to be failed if the last path is unavailable. But
>> we also need to requeue all pending I/Os whenever we have
>> changed the ANA state as now even inaccessible ANA states
>> influence I/O behaviour.
>>
>> Signed-off-by: Hannes Reinecke <hare at kernel.org>
>> ---
>> drivers/nvme/host/multipath.c | 15 +++++++++++++++
>> drivers/nvme/host/nvme.h | 1 +
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/nvme/host/multipath.c
>> b/drivers/nvme/host/multipath.c
>> index f72c5a6a2d8e..0373b4043eea 100644
>> --- a/drivers/nvme/host/multipath.c
>> +++ b/drivers/nvme/host/multipath.c
>> @@ -422,6 +422,10 @@ static bool nvme_available_path(struct
>> nvme_ns_head *head)
>> struct nvme_ns *ns;
>> if (!test_bit(NVME_NSHEAD_DISK_LIVE, &head->flags))
>> + return NULL;
>> +
>> + if (test_bit(NVME_NSHEAD_FAIL_ON_LAST_PATH, &head->flags) &&
>> + list_is_singular(&head->list))
>> return NULL;
>
> This is wrong too Hannes.
>
> If you added a live path, and when partition scan triggers your path
> becomes
> inaccessible, but then after anatt it becomes optimized again. You just
> ended up without discovering partitions...
But we will be getting an ANA state change AEN in that case, seeing that
we have enabled ANA state change events and the spec states that we
should be getting them (NVMe 1.3, section 8.21.4 Host ANA Transition
Operation explicitly has an 'or' between point a) and b) ...).
Which is what we rely upon anyway, otherwise we would have had to check
for INACCESSIBLE in nvme_update_ana_state, and not just for CHANGE as we
do now.
Plus the situation will be exactly the same if one uses 'persistent
loss' instead of 'inaccessible'...
And incidentally, the same issue happens when you disable the namespace
in nvmet just after the namespace got scanned, but before
device_add_disk() is called. So really it's not about ANA, but rather
about any path-related error here.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
More information about the Linux-nvme
mailing list