[Question] nvme namespace enumerated to 2 when hot-reinserted into a server in 5.14 kernel (RHEL 9.1)

Munoz Ruiz, Francisco francisco.munoz.ruiz at linux.intel.com
Tue Oct 18 16:24:56 PDT 2022



On 10/18/2022 2:04 PM, Keith Busch wrote:
> On Tue, Oct 18, 2022 at 01:22:01PM -0700, Munoz Ruiz, Francisco wrote:
>> Hi,
>>
>>
>> We are hardening our tests for RHEL, and we'd like to know if this is the
>> expected behavior.
>>
>> After a couple of tests via trace-cmd and booting with
>> nvme_core.multipath=N, we found that the number "2" presented to userspace
>> comes from nvme_init_ns_head()->nvme_alloc_ns_head() when an SSD is
>> hot-reinserted:
>>
>>
>> ret = ida_simple_get(&ctrl->subsys->ns_ida, 1, 0, GFP_KERNEL);
>> if (ret < 0)
>> 	goto out_free_head;
>> head->instance = ret;
>>
>> Then in nvme_alloc_ns() the string representing a disk name is assembled via
>> sprintf because nvme_mpath_set_disk_name returns immediately due to
>> multipath == false:
>> /*
>>   * Without the multipath code enabled, multiple controller per
>>   * subsystems are visible as devices and thus we cannot use the
>>   * subsystem instance.
>>   */
>> if (!nvme_mpath_set_disk_name(ns, disk->disk_name, &disk->flags))
>> 	sprintf(disk->disk_name, "nvme%dn%d", ctrl->instance,
>> 		ns->head->instance);
>> ns->disk = disk;
>>
>>
>> We also observed that the ida API used in core.c was updated in a recent
>> commit 8b850475c08caa9545c460d7. Does this fix the namespace numbering
>> presented to the user via disk_name? If not, should we pass our tests if
>> nvmeXn2 is observed instead of nvmeXn1 when hot-reinserting an SSD?
> 
> If the last number is incrementing, that means something is holding a
> reference to the older one that you removed.
> 
> Assuming a reference is being held, a hot remove-insert sequence would
> expect the controller holding the namespace to be removed. The
> subsequent insertion would have the controller enumerate with a new
> instance number instead.
In our test we are reinserting the SSD in a new PCI slot, and this
enumerates the controller with a new instance, as you said. However, we
still see namespace 2. Ex: nvme0n1 becomes nvme9n2.

> Ex, if the namespace was nvme0n1 before, we'd
> get nvme1n1 next time instead of nvme0n2. Or are you just detaching the
> namespace without detaching the controller?
We are not running any nvme-cli command (like detach-ns) before the
mechanical removal. (Sorry, I don't know if this is what you asked.)
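
To make the numbering question concrete, here is a minimal userspace
sketch of the behavior discussed above. It assumes the allocator hands
out the lowest free ID starting at 1, as the ida_simple_get() call quoted
earlier requests, and that the old namespace head has not yet released
its ID when the SSD comes back. Whether a reference is really being held
in our setup is still unconfirmed. The toy_* names are hypothetical
stand-ins for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_IDS 64

/* Toy stand-in for an IDA: records which IDs are currently held. */
struct toy_ida {
	bool used[MAX_IDS];
};

/* Return the lowest free ID >= min, or -1 if none is left. */
static int toy_ida_get(struct toy_ida *ida, int min)
{
	for (int id = min; id < MAX_IDS; id++) {
		if (!ida->used[id]) {
			ida->used[id] = true;
			return id;
		}
	}
	return -1;
}

/* Release an ID so that a later allocation can reuse it. */
static void toy_ida_put(struct toy_ida *ida, int id)
{
	assert(id >= 0 && id < MAX_IDS);
	ida->used[id] = false;
}

/* Assemble the name with the same format string as the quoted sprintf(). */
static void toy_disk_name(char *buf, size_t len, int ctrl_instance,
			  int head_instance)
{
	snprintf(buf, len, "nvme%dn%d", ctrl_instance, head_instance);
}
```

With this sketch, a first toy_ida_get(&ida, 1) returns 1 (nvme0n1). If
toy_ida_put() is never called for that ID and the controller comes back
as instance 9, the next toy_ida_get(&ida, 1) returns 2, which gives
nvme9n2. Once the old ID is released, the next allocation returns 1 again.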


