[bug report] blktests nvme/047 failed due to /dev/nvme0n1 not created in time

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Wed Aug 9 17:19:39 PDT 2023


On Aug 09, 2023 / 18:50, Daniel Wagner wrote:
> On Tue, Aug 08, 2023 at 10:46:46AM +0200, Daniel Wagner wrote:
> > On Fri, Aug 04, 2023 at 06:33:04PM +0800, Yi Zhang wrote:
> > > On Tue, Aug 1, 2023 at 7:28 PM Yi Zhang <yi.zhang at redhat.com> wrote:
> > > > After some investigating, I found it was due to the /dev/nvme0n1 node
> > > > couldn't be created in time which led to the following fio failing.
> > > > + nvme connect -t tcp -a 127.0.0.1 -s 4420 -n blktests-subsystem-1
> > > > --hostnqn=nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
> > > > --hostid=0f01fb42-9f7f-4856-b0b3-51e60b8de349 --nr-write-queues=1
> > > > + ls -l /dev/nvme0 /dev/nvme-fabrics
> > > > crw-------. 1 root root 234,   0 Aug  1 05:50 /dev/nvme0
> > > > crw-------. 1 root root  10, 122 Aug  1 05:50 /dev/nvme-fabrics
> > > > + '[' '!' -b /dev/nvme0n1 ']'
> > > > + echo '/dev/nvme0n1 node still not created'
> > > > dmesg:
> > > > [ 1840.413396] loop0: detected capacity change from 0 to 10485760
> > > > [ 1840.934379] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> > > > [ 1841.018766] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
> > > > [ 1846.782615] nvmet: creating nvm controller 1 for subsystem
> > > > blktests-subsystem-1 for NQN
> > > > nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
> > > > [ 1846.808392] nvme nvme0: creating 33 I/O queues.
> > > > [ 1846.874298] nvme nvme0: mapped 1/32/0 default/read/poll queues.
> > > > [ 1846.945334] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr
> > > > 127.0.0.1:4420
> > 
> > Not really sure how the blk device registration code works, but this
> > looks like there is something executed not in the same context as the
> > nvme-cli command and thus we might return to userspace before the device
> > is fully created. And there are also udev events which are handled by
> > systemd. If this is the case, we might want to add some generic helper
> > which waits for the device to pop up before we continue with the test.
> 
> After looking a bit at nvme/010 I see why those tests are not failing
> in the same way as nvme/047. After connecting _find_nvme_dev is used
> to wait for the device to appear:

Thanks. _find_nvme_dev() looks like the key to solving the failure. I took a
look at the function and observed that:

- nvme/047 also calls _find_nvme_dev().
- To be precise, _find_nvme_dev() waits for /sys/block/$dev/uuid and
  /sys/block/$dev/wwid to appear. It does not wait for the /dev/$dev node
  itself to appear.

To wait for the device node to appear, the change below will probably work.

Yi, could you try and see if it avoids the failure?

diff --git a/tests/nvme/rc b/tests/nvme/rc
index 4f3a994..005db80 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -740,7 +740,7 @@ _find_nvme_dev() {
 		if [[ "$subsysnqn" == "$subsys" ]]; then
 			echo "$dev"
 			for ((i = 0; i < 10; i++)); do
-				if [[ -e /sys/block/$dev/uuid &&
+				if [[ -e /dev/$dev && -e /sys/block/$dev/uuid &&
 					-e /sys/block/$dev/wwid ]]; then
 					return
 				fi
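
For reference, the generic wait helper suggested earlier in the thread could
look roughly like the sketch below. This is only an illustration under
assumptions: _wait_for_path and its timeout argument are hypothetical names,
not existing blktests functions, and the 0.1 second poll interval is an
arbitrary choice. It uses the same -e check as the change above.

```shell
#!/bin/bash

# Hypothetical helper (not part of blktests): poll until the given path
# exists, or give up after roughly the given number of seconds.
_wait_for_path() {
	local path="$1"           # e.g. /dev/nvme0n1
	local timeout="${2:-10}"  # whole seconds; illustrative default
	local i

	# Poll ten times per second until the timeout is reached.
	for ((i = 0; i < timeout * 10; i++)); do
		if [[ -e "$path" ]]; then
			return 0
		fi
		sleep 0.1
	done

	# The path never appeared within the timeout.
	return 1
}
```

A test could call it right after the connect step, for example with
"/dev/${nvmedev}n1" as the path, and bail out if it returns non-zero.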


