race between nvme device creation and discovery?
Daniel Wagner
dwagner at suse.de
Sun Feb 4 23:47:21 PST 2024
On Mon, Feb 05, 2024 at 02:02:04PM +0900, Hannes Reinecke wrote:
> Hehe. Good old sysfs.
> This is a common issue with sysfs, and we've even had a retry loop in udev
> back in the day to avoid this kind of thing.
>
> Point is, uevent will be sent out with device_add(), causing udev to run,
> running udev rules, and eventually call into libnvme to scan the device. But
> as you rightly pointed out, the sysfs link is only created
> _after_ the event has been sent, so there's a race window during which
> libnvme will fail to read the link, landing us with the scenario above.
>
> While we could add a retry logic to libnvme, I'm not really convinced
> this is a good idea; in the end, who's to tell how long we should wait?
> A second? Several seconds? A minute? Several minutes?
> Also note that sysfs_create_link() has a return code, so the link might
> not be created at all ...
Yep, retry logic always has a smell to it. What about the idea to
add some sort of ready attribute to the controller sysfs entry?
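
To make the idea concrete, a userspace consumer such as libnvme or
blktests could check such an attribute before reading the controller's
links. A minimal sketch, assuming a hypothetical attribute named "ready"
that reads '1' once setup is complete; neither the attribute nor the
helper ctrl_is_ready() exists today:

```c
#include <stdio.h>

/*
 * Hypothetical helper: the "ready" attribute is only the idea floated
 * in this mail, it is not an existing kernel interface.
 *
 * Returns 1 if <ctrl_dir>/ready starts with '1', 0 if it starts with
 * anything else, and -1 if the attribute cannot be opened or is empty.
 */
static int ctrl_is_ready(const char *ctrl_dir)
{
	char path[4096];
	FILE *f;
	int n;

	/* Build "<ctrl_dir>/ready" and reject paths that do not fit. */
	n = snprintf(path, sizeof(path), "%s/ready", ctrl_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* Only the first character matters for this sketch. */
	n = fgetc(f);
	fclose(f);
	if (n == EOF)
		return -1;

	return n == '1';
}
```

A caller would pass the controller's sysfs directory, e.g.
/sys/class/nvme/nvme0, and treat the controller as not yet usable
while the result is not 1.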
> A possibly better way here would be to suppress uevents on device_add(),
> and only send out events once the device is fully set up, ie just before
> the 'return 0'.
I don't think this will address the problem. blktests runs completely
independently of what udev does, and it is blktests that observes the
missing link, not udev.