[RFC-PATCH 2/2] nvme: use the namespace id for block device names

Nilay Shroff nilay at linux.ibm.com
Wed Mar 4 03:55:28 PST 2026


On 3/3/26 10:23 PM, Keith Busch wrote:
> On Tue, Mar 03, 2026 at 08:39:56AM -0700, Keith Busch wrote:
>> On Tue, Mar 03, 2026 at 08:34:38AM +0100, Hannes Reinecke wrote:
>>> The idea is nice, and I would love to go into that
>>> direction.
>>> But have you checked how this holds up under
>>> rescan/remapping (eg things like blktest/nvme/058)?
>>> Removal of the sysfs nodes might be delayed, and we cannot
>>> create new entries with the same name until then.
>>> So if that is taken care of, fine, but I don't see that in
>>> the patch ...
>>
>> I see. I set out to ensure everything was ordered, but apparently I've
>> missed this case. Thanks for pointing out the test.
> 
> Okay, I think it's simple: unlinking the head before actually calling
> del_gendisk opens a window in which one scan work creates a new
> namespace before the stale one is deleted by a different scan. This
> should fix it:
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 3f2f9b2be87c2..e8dbb6cb85694 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4246,11 +4246,8 @@ static void nvme_ns_remove(struct nvme_ns *ns)
> 
>   	mutex_lock(&ns->ctrl->subsys->lock);
>   	list_del_rcu(&ns->siblings);
> -	if (list_empty(&ns->head->list)) {
> -		if (!nvme_mpath_queue_if_no_path(ns->head))
> -			list_del_init(&ns->head->entry);
> +	if (list_empty(&ns->head->list))
>   		last_path = true;
> -	}
>   	mutex_unlock(&ns->ctrl->subsys->lock);
>   
>   	/* guarantee not available in head->list */
> --
> 
> This opens a different race where Controller A is deleting the last
> path, and Controller B is bringing up a new namespace that reused the
> NSID. An earlier patch from me should fix that by rescheduling B's
> scan_work once A detects it removed the last path. It should work, but
> it feels a bit off to me, so I'll think about it a little more.
> 

I think we need to ensure that the head disk is fully removed from the 
system before a new head is created that may reuse the same NSID; 
otherwise we may run into name collisions. Since removal of the sysfs 
entries associated with the disk can be delayed, a namespace scan may 
attempt to create a new head with the same name while the previous one 
is still being torn down.

To avoid this, it might be better to defer removing head->entry from the 
subsys->nsheads list until after the head node’s gendisk has been 
removed in nvme_remove_head().

From your first patch in this series, I see that the removal of
head->entry from subsys->nsheads was moved to the beginning of
nvme_remove_head(). IMO, in addition to the change you proposed above,
if we also defer the removal of head->entry until after del_gendisk()
has been called, that should ensure the old disk is fully removed
before a new head with the same NSID can be created.
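
To make the ordering argument concrete, here is a rough userspace model 
of the two teardown orders. It is only a sketch, not kernel code: 
model_subsys, model_scan() and model_scan_during_teardown() are made-up 
names, head_linked[] stands in for head->entry being on subsys->nsheads, 
and name_in_use[] stands in for the disk name that del_gendisk() 
eventually releases. What a real scan does when it finds a head that is 
still linked but being torn down is outside this model.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NSID 8

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_subsys {
	bool head_linked[MAX_NSID];	/* models head->entry on subsys->nsheads */
	bool name_in_use[MAX_NSID];	/* models the disk name still registered */
};

enum scan_result {
	SCAN_CREATED,		/* new head and disk name were created */
	SCAN_FOUND_OLD_HEAD,	/* old head still linked, no new name created */
	SCAN_NAME_COLLISION,	/* head gone from the list, name still taken */
};

/* Models a namespace scan that wants a head for the given NSID. */
static enum scan_result model_scan(struct model_subsys *s, unsigned int nsid)
{
	assert(nsid < MAX_NSID);

	if (s->head_linked[nsid])
		return SCAN_FOUND_OLD_HEAD;
	if (s->name_in_use[nsid])
		return SCAN_NAME_COLLISION;

	s->head_linked[nsid] = true;
	s->name_in_use[nsid] = true;
	return SCAN_CREATED;
}

/*
 * Tears down the head for one NSID in two steps and lets a second scan
 * run in the window between them. Returns what that second scan saw.
 */
static enum scan_result model_scan_during_teardown(bool unlink_first)
{
	struct model_subsys s = {0};
	const unsigned int nsid = 1;
	enum scan_result r;

	r = model_scan(&s, nsid);
	assert(r == SCAN_CREATED);

	if (unlink_first) {
		s.head_linked[nsid] = false;	/* unlink head->entry early */
		r = model_scan(&s, nsid);	/* racing scan lands here */
		s.name_in_use[nsid] = false;	/* disk removal finishes late */
	} else {
		s.name_in_use[nsid] = false;	/* disk removal finishes first */
		r = model_scan(&s, nsid);	/* racing scan lands here */
		s.head_linked[nsid] = false;	/* unlink head->entry last */
	}
	return r;
}
```

In this model, model_scan_during_teardown(true) returns 
SCAN_NAME_COLLISION, while model_scan_during_teardown(false) returns 
SCAN_FOUND_OLD_HEAD, i.e. the racing scan never tries to register a 
name that is still in use.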

Thanks,
--Nilay


