[PATCH] nvme: allow to re-attach namespaces after all paths are down
Hannes Reinecke
hare at suse.de
Tue May 11 23:05:38 PDT 2021
On 5/11/21 7:02 PM, Sagi Grimberg wrote:
>
>>>>> @@ -3605,16 +3608,26 @@ static int nvme_init_ns_head(struct nvme_ns
>>>>> *ns, unsigned nsid,
>>>>> head->shared = is_shared;
>>>>> } else {
>>>>> ret = -EINVAL;
>>>>> - if (!is_shared || !head->shared) {
>>>>> - dev_err(ctrl->device,
>>>>> - "Duplicate unshared namespace %d\n", nsid);
>>>>> - goto out_put_ns_head;
>>>>> - }
>>>>> - if (!nvme_ns_ids_equal(&head->ids, ids)) {
>>>>> - dev_err(ctrl->device,
>>>>> - "IDs don't match for shared namespace %d\n",
>>>>> + /*
>>>>> + * If multipath is enabled we might hit an ns head with no
>>>>> + * paths, but that doesn't indicate it's a shared namespace.
>>>>> + */
>>>>> + if (!nvme_ns_head_multipath(head) ||
>>>>> + !list_empty(&head->list)) {
>>>>> + if (!is_shared || !head->shared) {
>>>>> + dev_err(ctrl->device,
>>>>> + "Duplicate unshared namespace %d\n", nsid);
>>>>> + goto out_put_ns_head;
>>>>> + }
>>>>
>>>> If not multipath, then it is not shared. The above will fail attaching
>>>> single-path namespaces to a known head.
>>>>
>>>> The rest is similar to something I was working on too, though, so I
>>>> think it's the right direction.
>>>>
>>>>
>>>>> + if (!nvme_ns_ids_equal(&head->ids, ids)) {
>>>>> + dev_err(ctrl->device,
>>>>> + "IDs don't match for shared namespace %d\n",
>>>>> nsid);
>>>>> - goto out_put_ns_head;
>>>>> + goto out_put_ns_head;
>>>>> + }
>>>>> + } else {
>>>>> + /* But the ids might have changed, so reset them */
>>>>> + head->ids = *ids;
>>>>> }
>>>>> }
>>>>> @@ -3764,8 +3777,6 @@ static void nvme_ns_remove(struct nvme_ns *ns)
>>>>> mutex_lock(&ns->ctrl->subsys->lock);
>>>>> list_del_rcu(&ns->siblings);
>>>>> - if (list_empty(&ns->head->list))
>>>>> - list_del_init(&ns->head->entry);
>>>
>>> Hannes, you sent a patch like this before, my comment was that a nshead
>>> should be removed before the final reference (which who knows when it
>>> will ever arrive) as if a nsid were to be reused by the controller for
>>> a different namespace then we'd reject it so I'm not sure how this
>>> helps?
>>>
>>
>> But we _need_ to have the nshead available in the linked list if we
>> ever want to re-attach a namespace after reconnecting the controller.
>
> Again, this is a queue_if_no_path functionality, this sequence is
> correct for that.
>
We so need to have a face-to-face meeting for this will all concerned
parties ...
Why do we need queue_if_no_path for this?
We do _not_ queue I/O in any shape or form; in fact, MD will register
I/O errors on the failed nshead just fine. And my understanding for
'queue_if_no_path' is that I/O will be held until the underlying paths
gets reattached. But this is _not_ what this patch does.
This patch merely deals with the situation when a controller is
reattached after an all-paths-removed scenario.
And as the nshead still _is_ present we should use it, precisely to
avoid the situation where the NSID changes underneath us.
>> And we should be able to catch the 'reuse namespace' issue by checking
>> the contents of the nsids field, which we keep attached to the nshead.
>
> Well, you may be right, what I'm concerned about is that a new namespace
> may be falsely identified as the same namespace because it is
> re-attaching to a lingering mounted mpath device.
Please check v3. Your concern should be addressed there.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
More information about the Linux-nvme
mailing list