[PATCH 2/2] nvme: add 'queue_if_no_path' semantics

Christoph Hellwig hch at lst.de
Mon Oct 5 08:52:01 EDT 2020


On Mon, Oct 05, 2020 at 02:45:00PM +0200, Hannes Reinecke wrote:
> Currently namespaces behave differently depending on the 'CMIC'
> setting. If CMIC is zero, the device is removed once the last path
> goes away. If CMIC has the multipath bit set, the device is retained
> even if the last path is removed.
> This is okay for fabrics, where one can do an explicit disconnect
> to remove the device, but for nvme-pci this induces a regression
> with PCI hotplug.
> When the NVMe device is held open (e.g. by MD), it is not removed
> after a PCI hot-remove. Hence MD is not notified about the event and
> continues to consider the device operational.
> Consequently, upon PCI hot-add the device shows up as a new NVMe
> device, and MD fails to reattach it.
> So this patch adds an NVME_NSHEAD_QUEUE_IF_NO_PATH flag to the nshead
> to restore the original behaviour for non-fabrics NVMe devices.
> 
> Signed-off-by: Hannes Reinecke <hare at suse.de>
> ---
>  drivers/nvme/host/core.c      | 10 +++++++++-
>  drivers/nvme/host/multipath.c | 38 ++++++++++++++++++++++++++++++++++++++
>  drivers/nvme/host/nvme.h      |  2 ++
>  3 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4459a40b057c..e21c32ea4b51 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -475,8 +475,11 @@ static void nvme_free_ns_head(struct kref *ref)
>  		container_of(ref, struct nvme_ns_head, ref);
>  
>  #ifdef CONFIG_NVME_MULTIPATH
> -	if (head->disk)
> +	if (head->disk) {
> +		if (test_bit(NVME_NSHEAD_QUEUE_IF_NO_PATH, &head->flags))
> +			nvme_mpath_remove_disk(head);
>  		put_disk(head->disk);
> +	}
>  #endif
>  	ida_simple_remove(&head->subsys->ns_ida, head->instance);
>  	cleanup_srcu_struct(&head->srcu);
> @@ -3357,6 +3360,7 @@ static struct attribute *nvme_ns_id_attrs[] = {
>  #ifdef CONFIG_NVME_MULTIPATH
>  	&dev_attr_ana_grpid.attr,
>  	&dev_attr_ana_state.attr,
> +	&dev_attr_queue_if_no_path.attr,
>  #endif
>  	NULL,
>  };
> @@ -3387,6 +3391,10 @@ static umode_t nvme_ns_id_attrs_are_visible(struct kobject *kobj,
>  		if (!nvme_ctrl_use_ana(nvme_get_ns_from_dev(dev)->ctrl))
>  			return 0;
>  	}
> +	if (a == &dev_attr_queue_if_no_path.attr) {
> +		if (dev_to_disk(dev)->fops == &nvme_fops)
> +			return 0;
> +	}
>  #endif
>  	return a->mode;
>  }
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 55045291b4de..bbdad5917112 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -381,6 +381,9 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head)
>  	/* set to a default value for 512 until disk is validated */
>  	blk_queue_logical_block_size(q, 512);
>  	blk_set_stacking_limits(&q->limits);
> +	/* Enable queue_if_no_path semantics for fabrics */
> +	if (ctrl->ops->flags & NVME_F_FABRICS)
> +		set_bit(NVME_NSHEAD_QUEUE_IF_NO_PATH, &head->flags);

Well, that is blindingly obvious from the code.  But why would we treat
fabrics specially?
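
For reference, the quoted hunks add &dev_attr_queue_if_no_path to
nvme_ns_id_attrs, but the sysfs show/store implementation itself was
trimmed from the quote.  A minimal sketch of what such an attribute
could look like follows; this is an illustration only, not the actual
patch hunk, and it assumes (as in current multipath.c) that
head->disk->private_data points at the nvme_ns_head and that the
per-path disks are already filtered out by
nvme_ns_id_attrs_are_visible():

/*
 * Hypothetical sketch, not taken from the patch.  Only reached for the
 * head gendisk, so private_data is the nvme_ns_head.
 */
static ssize_t queue_if_no_path_show(struct device *dev,
		struct device_attribute *attr, char *buf)
{
	struct nvme_ns_head *head = dev_to_disk(dev)->private_data;

	return sprintf(buf, "%s\n",
		       test_bit(NVME_NSHEAD_QUEUE_IF_NO_PATH, &head->flags) ?
		       "on" : "off");
}

static ssize_t queue_if_no_path_store(struct device *dev,
		struct device_attribute *attr, const char *buf, size_t count)
{
	struct nvme_ns_head *head = dev_to_disk(dev)->private_data;
	bool enable;

	if (kstrtobool(buf, &enable) < 0)
		return -EINVAL;

	if (enable)
		set_bit(NVME_NSHEAD_QUEUE_IF_NO_PATH, &head->flags);
	else
		clear_bit(NVME_NSHEAD_QUEUE_IF_NO_PATH, &head->flags);
	return count;
}
static DEVICE_ATTR_RW(queue_if_no_path);

With something along those lines an admin could presumably do
'echo off > /sys/block/nvme0n1/queue_if_no_path' to get back the
remove-on-last-path behaviour even on a fabrics controller, rather than
having the policy hard-wired to NVME_F_FABRICS.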


