[PATCH 2/3] nvme-multipath: cannot disconnect controller on stuck partition scan

Keith Busch kbusch at kernel.org
Mon Oct 7 11:19:23 PDT 2024


On Mon, Oct 07, 2024 at 12:01:33PM +0200, Hannes Reinecke wrote:
> @@ -239,6 +239,13 @@ static bool nvme_path_is_disabled(struct nvme_ns *ns)
>  {
>  	enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
>  
> +	/*
> +	 * Skip deleted controllers for I/O from partition scan
> +	 */
> +	if (state == NVME_CTRL_DELETING &&
> +	    mutex_is_locked(&ns->ctrl->scan_lock))
> +		return true;

This feels off to me, using these seemingly unrelated dependencies to
make these kinds of decisions.

We talked a couple weeks ago about suppressing the partition scanning
during nvme scan_work. I know you said there was some reason it wouldn't
work, but could you check the below? It seems okay to me.

---
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 48e7a8906d012..82cb1eb3a773b 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -586,6 +586,12 @@ static void nvme_requeue_work(struct work_struct *work)
 		container_of(work, struct nvme_ns_head, requeue_work);
 	struct bio *bio, *next;
 
+	if (test_and_clear_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state)) {
+		mutex_lock(&head->disk->open_mutex);
+		bdev_disk_changed(head->disk, false);
+		mutex_unlock(&head->disk->open_mutex);
+	}
+
 	spin_lock_irq(&head->requeue_lock);
 	next = bio_list_get(&head->requeue_list);
 	spin_unlock_irq(&head->requeue_lock);
@@ -629,6 +635,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
 		return PTR_ERR(head->disk);
 	head->disk->fops = &nvme_ns_head_ops;
 	head->disk->private_data = head;
+	set_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state);
 	sprintf(head->disk->disk_name, "nvme%dn%d",
 			ctrl->subsys->instance, head->instance);
 	return 0;
--


