[PATCH V3 0/3] Ensure ordered namespace registration during async scan

Thu Feb 26 00:07:10 PST 2026

On Wed Feb 25, 2026 at 10:41 PM CET, Keith Busch wrote:
> On Wed, Feb 25, 2026 at 05:12:00PM +0100, Maurizio Lombardi wrote:
>> The NVMe fully asynchronous namespace scanning introduced in
>> commit 4e893ca81170 ("nvme-core: scan namespaces asynchronously")
>> significantly improved discovery times. However, it also introduced
>> non-deterministic ordering for namespace registration.
>>
>> While kernel device names (/dev/nvmeXnY) are not guaranteed to be stable
>> across reboots, this unpredictable ordering has caused considerable user
>> confusion and has been perceived as a regression, leading to multiple bug
>> reports.
>
> The nvme-pci driver also probes the controllers asynchronously, which
> can also create non-deterministic names. Is that part not a problem?

Potentially, it is. The difference is that so far no one has complained
about it, whereas with async namespace scanning we immediately received
regression reports, to the point that we had to revert the changes and
restore the sequential namespace scan in RHEL.

>
> Just on the suffix part of the namespace's block handle, I have a
> potential alternate suggestion here. The instance names pulled from the
> ida guarantee we'll always have unique names for the lifetime of the
> backing kobject. I introduced that a while ago, but I'm testing this out
> now and it seems kobject_del is sufficient to reuse that name. The
> driver already did that to all the objects when deleting the namespace,
> so there doesn't appear to be a reason to wait for the final
> kobject_put.
>
> What I'm saying is I may have been mistaken about the naming collision
> issues and we can just use the head's ns_id to get a consistent and
> meaningful name based off the backing namespaces. There's some unlikely
> races with multipath at the moment if we did use ns_id, but I think
> they're all fixable.

Ok, so you'd like to use the namespace's NSID as the suffix.
I also considered this approach; the reason I didn't implement
it is that I wanted to keep the async namespace scan performance improvements
while preserving the same enumeration we have had for years with the sequential scan:

Before the introduction of the async scan, /dev/nvme0n1 always pointed
to the first entry of the NSID list, /dev/nvme0n2 to the second
entry and so on.

With your proposal, a user with sparse NSIDs (1, 10, 333)
will get /dev/nvme0n1, /dev/nvme0n10, /dev/nvme0n333.
On one hand, yes, these names are "more stable" and more meaningful too;
on the other hand, this breaks the assumption of contiguous naming.
This might not be a problem for the mainline kernel, but I suspect we
will have people complaining again that the /dev/nvmeXnY enumeration changed.
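
To make the difference concrete, here is a rough userspace sketch of the
two naming schemes. It is only a simplified model of the behaviour described
above, not kernel code: the helper names are made up for illustration, and it
assumes the sequential scan walks the NSID list in ascending order and hands
out suffixes 1, 2, 3, ... in that order.

```python
def sequential_names(ctrl, nsids):
    """Model of the old sequential scan: the suffix is the 1-based
    position of the namespace in the (ascending) NSID list."""
    ordered = sorted(nsids)
    return [f"/dev/nvme{ctrl}n{pos}" for pos, _ in enumerate(ordered, start=1)]


def nsid_names(ctrl, nsids):
    """Model of the alternative proposal: the suffix is the NSID itself."""
    return [f"/dev/nvme{ctrl}n{nsid}" for nsid in sorted(nsids)]


if __name__ == "__main__":
    sparse = [1, 10, 333]  # hypothetical sparse NSID list
    for old, new in zip(sequential_names(0, sparse), nsid_names(0, sparse)):
        print(f"{old}  ->  {new}")
```

With the sparse list above, the first scheme yields n1, n2, n3 while the
second yields n1, n10, n333, which is the change in enumeration I expect
users to notice.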

Maurizio