[PATCH v3 for-5.8-rc 3/5] nvme: don't protect ns mutation with ns->head->lock

Hannes Reinecke hare at suse.de
Fri Jun 26 11:55:16 EDT 2020


On 6/24/20 10:53 AM, Sagi Grimberg wrote:
> Right now ns->head->lock is protecting namespace mutation
> which is wrong and unneeded. Move it to only protect
> against head mutations. While we're at it, remove unnecessary
> ns->head reference as we already have head pointer.
> 
> The problem with this is that the head->lock spans
> mpath disk node I/O that may block under some conditions (if
> for example the controller is disconnecting or the path
> became inaccessible), The locking scheme does not allow any
> other path to enable itself, preventing blocked I/O to complete
> and forward-progress from there.
> 
> This is a preparation patch for the fix in a subsequent patch
> where the disk I/O will also be done outside the head->lock.
> 
> Fixes: 0d0b660f214d ("nvme: add ANA support")
> Signed-off-by: Anton Eidelman <anton at lightbitslabs.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/multipath.c | 12 ++++--------
>   1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 0ef183c1153e..5c48a38aebf8 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -409,11 +409,10 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
>   {
>   	struct nvme_ns_head *head = ns->head;
>   
> -	lockdep_assert_held(&ns->head->lock);
> -
>   	if (!head->disk)
>   		return;
>   
> +	mutex_lock(&head->lock);
>   	if (!(head->disk->flags & GENHD_FL_UP))
>   		device_add_disk(&head->subsys->dev, head->disk,
>   				nvme_ns_id_attr_groups);
> @@ -426,9 +425,10 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
>   			__nvme_find_path(head, node);
>   		srcu_read_unlock(&head->srcu, srcu_idx);
>   	}
> +	mutex_unlock(&head->lock);
>   
> -	synchronize_srcu(&ns->head->srcu);
> -	kblockd_schedule_work(&ns->head->requeue_work);
> +	synchronize_srcu(&head->srcu);
> +	kblockd_schedule_work(&head->requeue_work);
>   }
>   
>   static int nvme_parse_ana_log(struct nvme_ctrl *ctrl, void *data,
> @@ -483,14 +483,12 @@ static inline bool nvme_state_is_live(enum nvme_ana_state state)
>   static void nvme_update_ns_ana_state(struct nvme_ana_group_desc *desc,
>   		struct nvme_ns *ns)
>   {
> -	mutex_lock(&ns->head->lock);
>   	ns->ana_grpid = le32_to_cpu(desc->grpid);
>   	ns->ana_state = desc->state;
>   	clear_bit(NVME_NS_ANA_PENDING, &ns->flags);
>   
>   	if (nvme_state_is_live(ns->ana_state))
>   		nvme_mpath_set_live(ns);
> -	mutex_unlock(&ns->head->lock);
>   }
>   
>   static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
> @@ -669,10 +667,8 @@ void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id)
>   			nvme_update_ns_ana_state(&desc, ns);
>   		}
>   	} else {
> -		mutex_lock(&ns->head->lock);
>   		ns->ana_state = NVME_ANA_OPTIMIZED;
>   		nvme_mpath_set_live(ns);
> -		mutex_unlock(&ns->head->lock);
>   	}
>   
>   	if (bdi_cap_stable_pages_required(ns->queue->backing_dev_info)) {
> 
But what provides synchronizsation of the updated namespace values?
And if it doesn't require synchronisation it would need some explanation 
why not.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare at suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer



More information about the Linux-nvme mailing list