race between nvme device creation and discovery?

Daniel Wagner dwagner at suse.de
Tue Feb 6 04:45:36 PST 2024


On Mon, Feb 05, 2024 at 06:13:13PM +0100, Daniel Wagner wrote:
> > > We have a 'state' attribute in sysfs already, though the
> > > NVME_CTRL_STARTED_ONCE is modelled as flag. Not sure how this fits
> > > together yet, so let me play with this.
> > 
> > If I got this right, we could just filter out all controllers which are
> > in the NEW state, because all transports trigger the connect code right
> > after the NEW state. At this point the sysfs should be completely
> > populated.
> 
> I've patched libnvme to print the state of the controller right
> before the lookup for the subsystem happens and it says:
> 
>     +state connecting
>     +failed to lookup subsystem for controller nvme1
> 
> more digging needed...

So after a lot more debugging and testing I think the real solution is
just not to issue the error message in libnvme. In this case we simply
don't add the device to the tree. And looking at the code, we already
have other error returns in this function which log nothing:


nvme_scan_ctrl()
{
	[...]

	subsysnqn = nvme_get_attr(path, "subsysnqn");
	if (!subsysnqn) {
		errno = ENXIO;
		return NULL;
	}
	subsysname = nvme_ctrl_lookup_subsystem_name(r, name);
	if (!subsysname) {
		nvme_msg(r, LOG_ERR,
			 "failed to lookup subsystem for controller %s\n",
			 name);
		errno = ENXIO;
		return NULL;
	}
	s = nvme_lookup_subsystem(h, subsysname, subsysnqn);

	if (!s) {
		errno = ENOMEM;
		return NULL;
	}

	[...]
}

Changing it from LOG_ERR to LOG_DEBUG makes the tests happy. No problems
observed.
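
For illustration, here is a minimal sketch of why lowering the level silences
the message. It assumes the logger drops every message whose level is
numerically above a configured threshold (syslog levels: LOG_ERR is 3,
LOG_DEBUG is 7). The names sketch_root and sketch_msg are hypothetical
stand-ins and are not libnvme's actual API.

```c
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>	/* LOG_ERR and LOG_DEBUG level constants */

/* Hypothetical stand-in for the library's root object (assumption). */
struct sketch_root {
	int log_level;	/* messages above this level are dropped */
};

/*
 * Hypothetical logger: prints to stderr and returns 1 if the message
 * passes the threshold, returns 0 if it is filtered out.
 */
static int sketch_msg(struct sketch_root *r, int level, const char *fmt, ...)
{
	va_list ap;

	if (level > r->log_level)
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}
```

With the threshold at LOG_ERR, a message logged at LOG_ERR is printed while
the same message at LOG_DEBUG is dropped. It shows up again only when the
threshold is raised to LOG_DEBUG, so the information stays available for
debugging.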


