[PATCH 3/3] nvme: 'nvme disconnect' hangs after remapping namespaces

Sagi Grimberg sagi at grimberg.me
Tue Sep 10 00:57:15 PDT 2024

On 09/09/2024 10:19, Hannes Reinecke wrote:
> During repetitive namespace map and unmap operations on the target
> (disabling the namespace, changing the UUID, enabling it again)
> the initial scan will hang as the target will be returning
> PATH_ERROR and the I/O is constantly retried:
>
> [<0>] folio_wait_bit_common+0x12a/0x310
> [<0>] filemap_read_folio+0x97/0xd0
> [<0>] do_read_cache_folio+0x108/0x390
> [<0>] read_part_sector+0x31/0xa0
> [<0>] read_lba+0xc5/0x160
> [<0>] efi_partition+0xd9/0x8f0
> [<0>] bdev_disk_changed+0x23d/0x6d0
> [<0>] blkdev_get_whole+0x78/0xc0
> [<0>] bdev_open+0x2c6/0x3b0
> [<0>] bdev_file_open_by_dev+0xcb/0x120
> [<0>] disk_scan_partitions+0x5d/0x100
> [<0>] device_add_disk+0x402/0x420
> [<0>] nvme_mpath_set_live+0x4f/0x1f0 [nvme_core]
> [<0>] nvme_mpath_add_disk+0x107/0x120 [nvme_core]
> [<0>] nvme_alloc_ns+0xac6/0xe60 [nvme_core]
> [<0>] nvme_scan_ns+0x2dd/0x3e0 [nvme_core]
> [<0>] nvme_scan_work+0x1a3/0x490 [nvme_core]
>
> Calling 'nvme disconnect' on controllers with these namespaces
> will hang as the disconnect operation tries to flush scan_work:
>
> [<0>] __flush_work+0x389/0x4b0
> [<0>] nvme_remove_namespaces+0x4b/0x130 [nvme_core]
> [<0>] nvme_do_delete_ctrl+0x72/0x90 [nvme_core]
> [<0>] nvme_delete_ctrl_sync+0x2e/0x40 [nvme_core]
> [<0>] nvme_sysfs_delete+0x35/0x40 [nvme_core]
> [<0>] kernfs_fop_write_iter+0x13d/0x1b0
> [<0>] vfs_write+0x404/0x510
>
> The flush happens before the namespaces are removed, and the
> controller state DELETING_NOIO (which would abort any pending
> I/O) is set only afterwards.
>
> This patch calls 'nvme_kick_requeue_lists()' when a controller
> enters the DELETING state to ensure all pending I/O is flushed,
> and also disables failover for any commands that complete with
> an error afterwards, breaking the infinite retry loop.
>
> Signed-off-by: Hannes Reinecke <hare at kernel.org>
> ---
>   drivers/nvme/host/core.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 651073280f6f..142babce1963 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -381,6 +381,8 @@ enum nvme_disposition {
>   
>   static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>   {
> +	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> +
>   	if (likely(nvme_req(req)->status == 0))
>   		return COMPLETE;
>   
> @@ -393,6 +395,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>   		return AUTHENTICATE;
>   
>   	if (req->cmd_flags & REQ_NVME_MPATH) {
> +		if (nvme_ctrl_state(ctrl) == NVME_CTRL_DELETING)
> +			return COMPLETE;

This looks wrong. What if I disconnect from one path and there are other
eligible paths?


