blktests failures with v6.11-rc1 kernel
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Mon Aug 19 05:34:51 PDT 2024
On Aug 14, 2024 / 18:05, Nilay Shroff wrote:
>
>
> On 8/13/24 12:36, Yi Zhang wrote:
> > On Sat, Aug 3, 2024 at 12:49 AM Nilay Shroff <nilay at linux.ibm.com> wrote:
> >
> > There are no simultaneous tests running during the CKI tests. I reproduced
> > the failure on that server, and it can always be reproduced within 5 runs:
> > # sh a.sh
> > ==============================0
> > nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [passed]
> > runtime 21.496s ... 21.398s
> > ==============================1
> > nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [failed]
> > runtime 21.398s ... 21.974s
> > --- tests/nvme/052.out 2024-08-10 00:30:06.989814226 -0400
> > +++ /root/blktests/results/nodev_tr_loop/nvme/052.out.bad 2024-08-13 02:53:51.635047928 -0400
> > @@ -1,2 +1,5 @@
> > Running nvme/052
> > +cat: /sys/block/nvme1n2/uuid: No such file or directory
> > +cat: /sys/block/nvme1n2/uuid: No such file or directory
> > +cat: /sys/block/nvme1n2/uuid: No such file or directory
> > Test complete
> > # uname -r
> > 6.11.0-rc3
>
> We may need to debug this further. Is it possible to patch blktests and
> collect some details when this issue manifests? If yes, can you please
> apply the below diff and re-run your test? This patch would capture the
> output of "nvme list" and the sysfs attribute tree created under the
> namespace head node, and store those details in the 052.full file.
>
> diff --git a/common/nvme b/common/nvme
> index 9e78f3e..780b5e3 100644
> --- a/common/nvme
> +++ b/common/nvme
> @@ -589,8 +589,23 @@ _find_nvme_ns() {
> if ! [[ "${ns}" =~ nvme[0-9]+n[0-9]+ ]]; then
> continue
> fi
> + echo -e "\nBefore ${ns}/uuid check:\n" >> ${FULL}
> + echo -e "\n`nvme list -v`\n" >> ${FULL}
> + echo -e "\n`tree ${ns}`\n" >> ${FULL}
> +
> [ -e "${ns}/uuid" ] || continue
> uuid=$(cat "${ns}/uuid")
> +
> + if [ "$?" = "1" ]; then
> + echo -e "\nFailed to read $ns/uuid\n" >> ${FULL}
> + echo "`nvme list -v`" >> ${FULL}
> + if [ -d "${ns}" ]; then
> + echo -e "\n`tree ${ns}`\n" >> ${FULL}
> + else
> + echo -e "\n${ns} doesn't exist!\n" >> ${FULL}
> + fi
> + fi
> +
> if [[ "${subsys_uuid}" == "${uuid}" ]]; then
> basename "${ns}"
> fi
>
>
> After applying the above diff, when this issue occurs on your system copy this
> file "</path/to/blktests>/results/nodev_tr_loop/nvme/052.full" and send it across.
> This may give us some clue about what might be going wrong.
Nilay, thank you for this suggestion. To follow it, I tried to recreate the
failure again, and managed to do it :) When I repeat the test case 20 or 40
times on one of my test machines, the failure is observed in a stable manner.
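(For reference, I drove the repetition with a simple loop along the lines
below. This is just a sketch; the iteration count is arbitrary, and I rely on
blktests picking up nvme_trtype from the environment or the config file.)

  for ((i = 0; i < 20; i++)); do
          nvme_trtype=loop ./check nvme/052 || break
  done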
I applied your debug patch above to blktests, then repeated the test case.
Unfortunately, the failure disappeared: even when I repeated the test case 100
times, the failure was not observed. I guess the debug echos changed the timing
of the sysfs uuid file accesses, and that made the failure disappear.
This helped me think about the cause. The test case repeats _create_nvmet_ns
and _remove_nvmet_ns, so it repeatedly creates and removes the sysfs uuid
file. I guess that when _remove_nvmet_ns echoes 0 to ${nvmet_ns_path}/enable
to remove the namespace, it does not wait for the removal work to complete.
Then, when _find_nvme_ns() checks for the existence of the sysfs uuid file, it
can see the sysfs uuid file that the previous _remove_nvmet_ns left behind.
When it then cats the sysfs uuid file, the cat fails because the file has been
removed in the meantime, before the next _create_nvmet_ns recreates it.
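For context, on the target side the namespace removal boils down to configfs
operations roughly like the ones below (a sketch of what _remove_nvmet_ns in
common/nvme does; the paths are examples for the default blktests subsystem
and namespace 1). Neither step waits for the host side to finish tearing down
the corresponding block device:

  # disable the target namespace; the host removes nvmeXnY asynchronously
  echo 0 > /sys/kernel/config/nvmet/subsystems/blktests-subsystem-1/namespaces/1/enable
  # drop the configfs directory for the namespace
  rmdir /sys/kernel/config/nvmet/subsystems/blktests-subsystem-1/namespaces/1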
Based on this guess, I created the patch below. It modifies the test case to
wait until the namespace device disappears after calling _remove_nvmet_ns. (I
assume that the sysfs uuid file disappears when the device file disappears.)
With this patch, the failure was not observed even when repeating the test
case 100 times. I also reverted the kernel commit ff0ffe5b7c3c ("nvme: fix
namespace removal list") from v6.11-rc4, then confirmed that the test case
with this change can still detect the regression.
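(For the record, the regression check was roughly the following; the build and
install steps of course depend on the local tree and setup.)

  git revert ff0ffe5b7c3c              # "nvme: fix namespace removal list"
  make -j"$(nproc)" && sudo make modules_install install
  # reboot into the reverted kernel, then:
  nvme_trtype=loop ./check nvme/052    # fails again, so the regression is caught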
I will do some more confirmation. If it goes well, I will post this change as
a formal patch.
diff --git a/tests/nvme/052 b/tests/nvme/052
index cf6061a..469cefd 100755
--- a/tests/nvme/052
+++ b/tests/nvme/052
@@ -39,15 +39,32 @@ nvmf_wait_for_ns() {
ns=$(_find_nvme_ns "${uuid}")
done
+ echo "$ns"
return 0
}
+nvmf_wait_for_ns_removal() {
+ local ns=$1 i
+
+ for ((i = 0; i < 10; i++)); do
+ if [[ ! -e /dev/$ns ]]; then
+ return
+ fi
+ sleep .1
+ echo "wait removal of $ns" >> "$FULL"
+ done
+
+ if [[ -e /dev/$ns ]]; then
+ echo "Failed to remove the namespace $ns"
+ fi
+}
+
test() {
echo "Running ${TEST_NAME}"
_setup_nvmet
- local iterations=20
+ local iterations=20 ns
_nvmet_target_setup
@@ -63,7 +80,7 @@ test() {
_create_nvmet_ns "${def_subsysnqn}" "${i}" "$(_nvme_def_file_path).$i" "${uuid}"
# wait until async request is processed and ns is created
- nvmf_wait_for_ns "${uuid}"
+ ns=$(nvmf_wait_for_ns "${uuid}")
if [ $? -eq 1 ]; then
echo "FAIL"
rm "$(_nvme_def_file_path).$i"
@@ -71,6 +88,7 @@ test() {
fi
_remove_nvmet_ns "${def_subsysnqn}" "${i}"
+ nvmf_wait_for_ns_removal "$ns"
rm "$(_nvme_def_file_path).$i"
}
done