[PATCH blktests] nvme/052: wait for namespace removal before recreating namespace
Nilay Shroff
nilay at linux.ibm.com
Tue Aug 20 09:55:36 PDT 2024
On 8/20/24 15:50, Shin'ichiro Kawasaki wrote:
> The CKI project reported that the test case nvme/052 fails occasionally
> with the errors below:
>
> nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [failed]
> runtime ... 22.209s
> --- tests/nvme/052.out 2024-07-30 18:38:29.041716566 -0400
> +++
> +/mnt/tests/gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/archive/production/kernel-t\
> ests-production.zip/storage/blktests/nvme/nvme-loop/blktests
> +/results/nodev_tr_loop/nvme/052.out.bad 2024-07-30 18:45:35.438067452 -0400
> @@ -1,2 +1,4 @@
> Running nvme/052
> +cat: /sys/block/nvme1n2/uuid: No such file or directory
> +cat: /sys/block/nvme1n2/uuid: No such file or directory
> Test complete
>
> The test case repeats creating and removing namespaces. When the test
> case removes the namespace by echoing 0 to the sysfs enable file, this
> echo write does not wait for the completion of the namespace removal.
> Before the removal completes, the test case recreates the namespace.
> At this point, the sysfs uuid file for the old namespace still exists.
> The test case misunderstands that the the sysfs uuid file would be for
> the recreated namespace, and tries to read it. However, the removal
> process for the old namespace deletes the sysfs uuid file at this point.
> Then the read attempt fails and results in the errors.
>
> To avoid the failure, wait for the namespace removal before recreating
> the namespace. For this purpose, add the new helper function
> nvmf_wait_for_ns_removal(). To specify the namespace to wait for, get
> the name of the namespace from nvmf_wait_for_ns(), and pass it to
> nvmf_wait_for_ns_removal().
>
> The test case intends to catch the regression fixed by the kernel commit
> ff0ffe5b7c3c ("nvme: fix namespace removal list"). I reverted the commit
> from the kernel v6.11-rc4, then confirmed that the test case still can
> catch the regression with this change.
>
> Link: https://lore.kernel.org/linux-block/tczctp5tkr34o3k3f4dlyhuutgp2ycex6gdbjuqx4trn6ewm2i@qbkza3yr5wdd/
> Fixes: 077211a0e9ff ("nvme: add test for creating/deleting file-ns")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki at wdc.com>
> ---
> Nelay, Yi, thank you for the feedbacks for the discussion
> thread at the Link. Here's the formal fix patch.
>
> tests/nvme/052 | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/tests/nvme/052 b/tests/nvme/052
> index cf6061a..e1ac823 100755
> --- a/tests/nvme/052
> +++ b/tests/nvme/052
> @@ -39,15 +39,32 @@ nvmf_wait_for_ns() {
> ns=$(_find_nvme_ns "${uuid}")
> done
>
> + echo "$ns"
> return 0
> }
>
> +nvmf_wait_for_ns_removal() {
> + local ns=$1 i
> +
> + for ((i = 0; i < 10; i++)); do
> + if [[ ! -e /dev/$ns ]]; then
> + return
> + fi
> + sleep .1
> + echo "wait removal of $ns" >> "$FULL"
> + done
> +
> + if [[ -e /dev/$ns ]]; then
> + echo "Failed to remove the namespace $ns"
> + fi
> +}
> +
Under nvmf_wait_for_ns_removal(), instead of checking the existence of "/dev/$ns",
how about checking the existence of file "/sys/block/$ns"? As we know, when this issue
manifests, we have a stale entry "/sys/block/$ns/$uuid" lurking around from the
previous iteration for sometime causing the observed symptom. So I think, we may reuse the
_find_nvme_ns() function to wait until the stale "/sys/block/$ns/$uuid" file
exists. Maybe something like below:
nvmf_wait_for_ns_removal() {
local ns
local timeout="5"
local uuid="$1"
ns=$(_find_nvme_ns "${uuid}")
start_time=$(date +%s)
while [[ ! -z "$ns" ]]; do
sleep 1
end_time=$(date +%s)
if (( end_time - start_time > timeout )); then
echo "namespace with uuid \"${uuid}\" still " \
"not deleted within ${timeout} seconds"
return 1
fi
echo "Waiting for $ns removal" >> ${FULL}
ns=$(_find_nvme_ns "${uuid}")
done
return 0
}
Thanks,
--Nilay
More information about the Linux-nvme
mailing list