[PATCH 3/3] nvme-multipath: skip failed paths during partition scan

Christoph Hellwig hch at lst.de
Mon Oct 7 23:40:20 PDT 2024


On Mon, Oct 07, 2024 at 12:01:34PM +0200, Hannes Reinecke wrote:
> From: Hannes Reinecke <hare at suse.de>
> 
> When an I/O error is encountered during scanning (i.e. when the
> scan_lock is held) we should avoid using this path until scanning
> is finished to avoid deadlocks with device_add_disk().
> So set a new flag NVME_NS_SCAN_FAILED if a failover happened during
> scanning, and skip this path in nvme_available_paths().
> Then we can check if that bit is set after device_add_disk() returned,
> and remove the disk again if no available paths are found.
> That allows the device to be recreated via the 'rescan' sysfs attribute
> once no I/O errors occur anymore.
> 
> Signed-off-by: Hannes Reinecke <hare at kernel.org>
> ---
>  drivers/nvme/host/multipath.c | 26 ++++++++++++++++++++++++++
>  drivers/nvme/host/nvme.h      |  1 +
>  2 files changed, 27 insertions(+)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index f03ef983a75f..4113d38606a4 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,13 @@ void nvme_failover_req(struct request *req)
>  		queue_work(nvme_wq, &ns->ctrl->ana_work);
>  	}
>  
> +	/*
> +	 * Do not use this path during scanning
> +	 * to avoid deadlocks in device_add_disk()
> +	 */
> +	if (mutex_is_locked(&ns->ctrl->scan_lock))
> +		set_bit(NVME_NS_SCAN_FAILED, &ns->flags);

Err, no.  mutex_is_locked() is never a valid way to detect the calling
context: it only tells you that some task holds the mutex, and obviously
that could be someone other than the caller.
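
To illustrate the point outside the kernel, here is a minimal userspace
sketch using pthreads.  Everything in it is a hypothetical stand-in:
scan_lock only mimics ctrl->scan_lock, scan_lock_is_locked() approximates
mutex_is_locked() with a trylock, and the thread-local in_scan flag is
just one way of tracking context explicitly, not a proposal for the
driver.  A scanner thread takes the lock, and an unrelated thread then
asks both questions.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for ctrl->scan_lock. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;

/* Handshake state so the demonstration is deterministic. */
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sync_cond = PTHREAD_COND_INITIALIZER;
static bool scanner_holds_lock;
static bool scanner_may_exit;

/* Explicit per-thread context marker (an assumption for this sketch). */
static _Thread_local bool in_scan;

/* Approximates mutex_is_locked(): true if *any* thread holds scan_lock. */
static bool scan_lock_is_locked(void)
{
	if (pthread_mutex_trylock(&scan_lock) == 0) {
		pthread_mutex_unlock(&scan_lock);
		return false;
	}
	return true;
}

/* Plays the role of the scan work: holds scan_lock until told to stop. */
static void *scanner(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&scan_lock);
	in_scan = true;

	pthread_mutex_lock(&sync_lock);
	scanner_holds_lock = true;
	pthread_cond_broadcast(&sync_cond);
	while (!scanner_may_exit)
		pthread_cond_wait(&sync_cond, &sync_lock);
	pthread_mutex_unlock(&sync_lock);

	in_scan = false;
	pthread_mutex_unlock(&scan_lock);
	return NULL;
}

struct observation {
	bool lock_check;	/* what the "is it locked?" test reports */
	bool flag_check;	/* what the explicit context flag reports */
};

/* Called from a thread that is NOT scanning. */
struct observation observe_from_other_thread(void)
{
	struct observation obs = { false, false };
	pthread_t tid;

	scanner_holds_lock = false;
	scanner_may_exit = false;
	if (pthread_create(&tid, NULL, scanner, NULL) != 0)
		return obs;

	/* Wait until the scanner thread really owns scan_lock. */
	pthread_mutex_lock(&sync_lock);
	while (!scanner_holds_lock)
		pthread_cond_wait(&sync_cond, &sync_lock);
	pthread_mutex_unlock(&sync_lock);

	obs.lock_check = scan_lock_is_locked();
	obs.flag_check = in_scan;

	/* Let the scanner drop the lock and finish. */
	pthread_mutex_lock(&sync_lock);
	scanner_may_exit = true;
	pthread_cond_broadcast(&sync_cond);
	pthread_mutex_unlock(&sync_lock);

	pthread_join(tid, NULL);
	return obs;
}
```

Calling observe_from_other_thread() yields lock_check == true even though
the caller is not scanning, while flag_check stays false.  The "is it
locked" test answers a question about the mutex, not about the caller.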
