[RFC-PATCH 2/2] nvme: use the namespace id for block device names
Nilay Shroff
nilay at linux.ibm.com
Wed Mar 4 03:55:28 PST 2026
On 3/3/26 10:23 PM, Keith Busch wrote:
> On Tue, Mar 03, 2026 at 08:39:56AM -0700, Keith Busch wrote:
>> On Tue, Mar 03, 2026 at 08:34:38AM +0100, Hannes Reinecke wrote:
>>> The idea is nice, and I would love to go into that
>>> direction.
>>> But have you checked how this holds up under
>>> rescan/remapping (e.g. things like blktest/nvme/058)?
>>> Removal of the sysfs nodes might be delayed, and we cannot
>>> create new entries with the same name until then.
>>> So if that is taken care of, fine, but I don't see that in
>>> the patch ...
>>
>> I see. I set out to ensure everything was ordered, but apparently I've
>> missed this case. Thanks for pointing out the test.
>
> Okay, I think it's simple: unlinking the head before actually doing
> the del_gendisk opens a window in which one scan work can create a new
> namespace before a different scan has deleted the stale one. This
> should fix it:
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3f2f9b2be87c2..e8dbb6cb85694 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4246,11 +4246,8 @@ static void nvme_ns_remove(struct nvme_ns *ns)
>
>  	mutex_lock(&ns->ctrl->subsys->lock);
>  	list_del_rcu(&ns->siblings);
> -	if (list_empty(&ns->head->list)) {
> -		if (!nvme_mpath_queue_if_no_path(ns->head))
> -			list_del_init(&ns->head->entry);
> +	if (list_empty(&ns->head->list))
>  		last_path = true;
> -	}
>  	mutex_unlock(&ns->ctrl->subsys->lock);
>
>  	/* guarantee not available in head->list */
> --
>
> This opens a different race where Controller A is deleting the last
> path, and Controller B is bringing up a new namespace that reused the
> NSID. An earlier patch from me should fix that by rescheduling B's
> scan_work once A detects it removed the last path. It should work, but
> it feels a bit off to me, so I'll think about it a little more.
>
I think we need to ensure that the head disk is fully removed from the
system before a new head that may reuse the same NSID is created;
otherwise we may run into name collisions. Since removal of the sysfs
entries associated with the disk can be delayed, a namespace scan may
attempt to create a new head with the same name while the previous one
is still being torn down.

To avoid this, it might be better to defer removing head->entry from the
subsys->nsheads list until after the head node's gendisk has been
removed in nvme_remove_head().

From your first patch in this series, I see that the removal of
head->entry from subsys->nsheads was moved to the beginning of
nvme_remove_head(). IMO, in addition to the change you proposed above,
if we also defer the removal of head->entry until after del_gendisk()
has been called, that should ensure the old disk is fully removed before
a new head with the same NSID can be created.
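
To make the ordering argument concrete, here is a toy userspace model of
the two removal orders. It is only a sketch and not kernel code: struct
model, scan_same_nsid(), remove_unlink_first() and remove_disk_first()
are all made-up names, and the racing scan is simply called at the point
where the window would be, rather than run from a real second thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. None of these names are kernel APIs. */
struct model {
	bool head_listed;	/* models head->entry being on subsys->nsheads */
	bool name_in_use;	/* models the old disk name still being registered */
};

enum scan_result {
	SCAN_NAME_COLLISION = -1,	/* new head created while old name is live */
	SCAN_FOUND_OLD_HEAD = 0,	/* old head still listed, no new head created */
	SCAN_CREATED_NEW_HEAD = 1,	/* old head fully gone, new name is free */
};

/* Models a scan on another controller that sees the same NSID. */
enum scan_result scan_same_nsid(const struct model *m)
{
	if (m->head_listed)
		return SCAN_FOUND_OLD_HEAD;
	return m->name_in_use ? SCAN_NAME_COLLISION : SCAN_CREATED_NEW_HEAD;
}

/* Unlink the head first, delete the disk second; the scan races in between. */
enum scan_result remove_unlink_first(struct model *m)
{
	enum scan_result ret;

	assert(m->head_listed && m->name_in_use);
	m->head_listed = false;		/* head no longer findable */
	ret = scan_same_nsid(m);	/* racing scan runs in the window */
	m->name_in_use = false;		/* old disk name released too late */
	return ret;
}

/* Delete the disk first, unlink the head second; the scan races in between. */
enum scan_result remove_disk_first(struct model *m)
{
	enum scan_result ret;

	assert(m->head_listed && m->name_in_use);
	m->name_in_use = false;		/* old disk name released first */
	ret = scan_same_nsid(m);	/* racing scan runs in the window */
	m->head_listed = false;		/* head unlinked only afterwards */
	return ret;
}
```

Starting from a model with both fields set, remove_unlink_first()
reports SCAN_NAME_COLLISION, while remove_disk_first() reports
SCAN_FOUND_OLD_HEAD, i.e. no second head with the same name is created.
What the scan should then do with a head that is being torn down is
outside this model.
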
Thanks,
--Nilay