[PATCH 3/3] nvme-multipath: skip failed paths during partition scan
Christoph Hellwig
hch at lst.de
Mon Oct 7 23:40:20 PDT 2024
On Mon, Oct 07, 2024 at 12:01:34PM +0200, Hannes Reinecke wrote:
> From: Hannes Reinecke <hare at suse.de>
>
> When an I/O error is encountered during scanning (ie when the
> scan_lock is held) we should avoid using this path until scanning
> is finished to avoid deadlocks with device_add_disk().
> So set a new flag NVME_NS_SCAN_FAILED if a failover happened during
> scanning, and skip this path in nvme_available_paths().
> Then we can check if that bit is set after device_add_disk() returned,
> and remove the disk again if no available paths are found.
> That allows the device to be recreated via the 'rescan' sysfs attribute
> once no I/O errors occur anymore.
>
> Signed-off-by: Hannes Reinecke <hare at kernel.org>
> ---
> drivers/nvme/host/multipath.c | 26 ++++++++++++++++++++++++++
> drivers/nvme/host/nvme.h | 1 +
> 2 files changed, 27 insertions(+)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index f03ef983a75f..4113d38606a4 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,13 @@ void nvme_failover_req(struct request *req)
>  		queue_work(nvme_wq, &ns->ctrl->ana_work);
>  	}
>
> +	/*
> +	 * Do not use this path during scanning
> +	 * to avoid deadlocks in device_add_disk()
> +	 */
> +	if (mutex_is_locked(&ns->ctrl->scan_lock))
> +		set_bit(NVME_NS_SCAN_FAILED, &ns->flags);

Err, no. mutex_is_locked() is never a valid way to detect the calling
context - obviously someone else could also be holding it.
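
The objection can be illustrated with a minimal userspace sketch. This is not kernel code and not part of the patch: it uses pthreads, and `scan_lock`, `lock_is_held_by_someone()`, `unrelated_failover()` and `observe_from_other_thread()` are hypothetical names chosen for the illustration. The helper approximates mutex_is_locked() with pthread_mutex_trylock(), which, like the kernel call, only says whether anyone holds the lock. A thread that never took the lock still sees it as held while another thread owns it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller's scan_lock. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Userspace approximation of mutex_is_locked(): reports whether
 * *anyone* holds the lock, not whether the caller does.
 */
static bool lock_is_held_by_someone(pthread_mutex_t *m)
{
	int ret = pthread_mutex_trylock(m);

	if (ret == 0) {
		/* Nobody held it; drop the lock we just took. */
		pthread_mutex_unlock(m);
		return false;
	}
	return ret == EBUSY;
}

/* Runs in a thread that never takes scan_lock itself. */
static void *unrelated_failover(void *seen)
{
	*(bool *)seen = lock_is_held_by_someone(&scan_lock);
	return NULL;
}

/*
 * The calling thread plays the scanner and optionally holds scan_lock.
 * Returns what a second, unrelated thread observed in the meantime.
 */
static bool observe_from_other_thread(bool scanner_holds_lock)
{
	pthread_t t;
	bool seen = false;
	int rc;

	if (scanner_holds_lock)
		pthread_mutex_lock(&scan_lock);

	rc = pthread_create(&t, NULL, unrelated_failover, &seen);
	assert(rc == 0);
	(void)rc;
	pthread_join(t, NULL);

	if (scanner_holds_lock)
		pthread_mutex_unlock(&scan_lock);
	return seen;
}
```

Calling observe_from_other_thread(true) yields true even though the observing thread is not the scanner, so a check of this shape would flag a path as failed during scanning for a failover that happened in an entirely different context.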