[PATCH 3/3] nvme: 'nvme disconnect' hangs after remapping namespaces

Hannes Reinecke hare at suse.de
Fri Sep 6 03:42:50 PDT 2024


On 9/6/24 12:35, Damien Le Moal wrote:
> On 9/6/24 19:16, Hannes Reinecke wrote:
>> During repetitive namespace map and unmap operations on the target
>> (disabling the namespace, changing the UUID, enabling it again)
>> the initial scan will hang as the target will be returning
>> PATH_ERROR and the I/O is constantly retried:
>>
>> [<0>] folio_wait_bit_common+0x12a/0x310
>> [<0>] filemap_read_folio+0x97/0xd0
>> [<0>] do_read_cache_folio+0x108/0x390
>> [<0>] read_part_sector+0x31/0xa0
>> [<0>] read_lba+0xc5/0x160
>> [<0>] efi_partition+0xd9/0x8f0
>> [<0>] bdev_disk_changed+0x23d/0x6d0
>> [<0>] blkdev_get_whole+0x78/0xc0
>> [<0>] bdev_open+0x2c6/0x3b0
>> [<0>] bdev_file_open_by_dev+0xcb/0x120
>> [<0>] disk_scan_partitions+0x5d/0x100
>> [<0>] device_add_disk+0x402/0x420
>> [<0>] nvme_mpath_set_live+0x4f/0x1f0 [nvme_core]
>> [<0>] nvme_mpath_add_disk+0x107/0x120 [nvme_core]
>> [<0>] nvme_alloc_ns+0xac6/0xe60 [nvme_core]
>> [<0>] nvme_scan_ns+0x2dd/0x3e0 [nvme_core]
>> [<0>] nvme_scan_work+0x1a3/0x490 [nvme_core]
>>
>> Calling 'nvme disconnect' on controllers with these namespaces
>> will hang as the disconnect operation tries to flush scan_work:
>>
>> [<0>] __flush_work+0x389/0x4b0
>> [<0>] nvme_remove_namespaces+0x4b/0x130 [nvme_core]
>> [<0>] nvme_do_delete_ctrl+0x72/0x90 [nvme_core]
>> [<0>] nvme_delete_ctrl_sync+0x2e/0x40 [nvme_core]
>> [<0>] nvme_sysfs_delete+0x35/0x40 [nvme_core]
>> [<0>] kernfs_fop_write_iter+0x13d/0x1b0
>> [<0>] vfs_write+0x404/0x510
>>
>> before the namespaces are removed.
>>
>> This patch sets the 'failfast_expired' bit for the controller
>> to cause all pending I/O to be failed, and the disconnect process
>> to complete.
>>
>> Signed-off-by: Hannes Reinecke <hare at kernel.org>
>> ---
>>   drivers/nvme/host/core.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 651073280f6f..b968b672dcf8 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4222,6 +4222,13 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
>>   	 */
>>   	nvme_mpath_clear_ctrl_paths(ctrl);
>>   
>> +	/*
>> +	 * Mark the controller as 'failfast' to ensure all pending I/O
>> +	 * is killed.
> 
> s/is/are ?
> 
Depending whether you consider I/O as a stream or individual commands.
'.. ensure all pending commands are killed.' ?

Cheers,

Hannes

-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich




More information about the Linux-nvme mailing list