[PATCH blktests] nvme/068: add a test for multipath delayed removal
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Thu Apr 16 05:50:19 PDT 2026
On Apr 15, 2026 / 10:41, John Garry wrote:
> For NVMe multipath, the delayed removal feature allows the multipath
> gendisk to remain present when all available paths are gone. The purpose of
> this feature is to ensure that we keep the gendisk for intermittent path
> failures.
>
> The delayed removal works on a timer - when all paths are gone, a timer is
> kicked off; once the timer expires and no paths have returned, the gendisk
> is removed.
>
> When all paths are gone and the gendisk is still present, all reads and
> writes to the disk are queued. If a path returns before the timer
> expiration, the timer canceled and the queued IO is submitted;
> otherwise they fail when the timer expires.
>
> This testcase covers two scenarios in separate parts:
> a. test that IOs submitted after all paths are removed (and do not return)
> fail
> b. test that IOs submitted between all paths removed and a path
> returning succeed
>
> During the period of the timer being active, it must be ensured that the
> nvme-core module is not removed. Otherwise the driver may not be present
> to handle the timeout expiry. The kernel ensures this by taking a
> reference to the module. Ideally, we would try to remove the module during
> this test to prove that this is not possible (and the kernel behaves as
> expected), but that module will probably not be removable anyway due to
> many references. To test this feature, check that the refcount of the
> nvme-core module is incremented when the delayed timer is active.
>
> Signed-off-by: John Garry <john.g.garry at oracle.com>
John, thanks for the patch. When I ran the new test case in my test environment,
it failed. The reported refcount mismatch looks happening in the Part b.
nvme/068 (tr=loop) (NVMe multipath delayed removal test) [failed]
runtime 38.579s ... 38.770s
--- tests/nvme/068.out 2026-04-16 20:50:21.228000000 +0900
+++ /home/shin/Blktests/blktests/results/nodev_tr_loop/nvme/068.out.bad 2026-04-16 21:30:36.215000000 +0900
@@ -1,3 +1,4 @@
Running nvme/068
pwrite: Input/output error
+module refcount not as original
Test complete
I have no idea why it fails. Do you have any guess about the failure cause?
Also, please find my review comments in line.
>
> diff --git a/common/rc b/common/rc
> index 5350057..6eae0e2 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -117,6 +117,16 @@ _module_not_in_use() {
> fi
> }
>
> +_module_use_count() {
> + local refcnt
> + if [ -f "/sys/module/$1/refcnt" ]; then
> + refcnt="$(cat /sys/module/"$1"/refcnt)"
> + echo $refcnt
To suppress Shellcheck warning, please add quotation marks: "$refcnt".
> + return
> + fi
> + echo ""
> +}
> +
> _have_module_param() {
> _have_driver "$1" || return
>
> diff --git a/tests/nvme/068 b/tests/nvme/068
> new file mode 100644
File mode 755 is recommended for consistency.
> index 0000000..e06fd6b
> --- /dev/null
> +++ b/tests/nvme/068
> @@ -0,0 +1,118 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2026 John Garry
> +#
> +# Test NVMe multipath delayed removal works as expected
> +
> +. tests/nvme/rc
> +. common/xfs
> +
> +DESCRIPTION="NVMe multipath delayed removal test"
> +QUICK=1
It is guided to set QUICK=1 for the test cases which completes
"in ~30 seconds or less". In my environment, this test case took 38
seconds, so I'm not so sure if this test case is quick. How long
does it take in your environment?
> +
> +requires() {
> + _nvme_requires
> + _have_loop
> + _have_module_param_value nvme_core multipath Y
> + _require_nvme_trtype_is_fabrics
> +}
> +
> +set_conditions() {
> + _set_nvme_trtype "$@"
> +}
> +
> +_delayed_nvme_reconnect_ctrl() {
> + sleep 5
> + _nvme_connect_subsys
> +}
> +
> +test() {
> + echo "Running ${TEST_NAME}"
> +
> + _setup_nvmet
> +
> + local nvmedev
> + local ns
> + local bytes_written
> + local refcnt_orig
> + local refcnt
> + _nvmet_target_setup
> +
> + _nvme_connect_subsys
> +
> + # Part a: Prove that writes fail when no path returns. Any reads or
> + # writes are queued during the delayed removal period. If no
> + # paths return before the timer expires, then those IOs should
> + # fail.
> + # During the delayed removal period, ensure that the module
> + # refcnt is incremented, to prove that we cannot remove the
> + # driver during this period.
> + nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
> + refcnt=$(_module_use_count nvme_core)
> + echo 10 > "/sys/block/"$ns"/delayed_removal_secs"
Shellcheck complains about the line above. I think it can be modified as below:
echo 10 > "/sys/block/${ns}/delayed_removal_secs"
> + refcnt_orig=$(_module_use_count nvme_core)
> + _nvme_disconnect_ctrl "${nvmedev}"
> + sleep 1
> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
> + if [[ "${ns}" = "" ]]; then
> + echo "could not find ns after disconnect"
> + fi
> + refcnt=$(_module_use_count nvme_core)
> + if [ "$refcnt" != "" ] && [ "$refcnt" -le "$refcnt_orig" ]; then
> + echo "module refcount did not increase"
> + fi
> + bytes_written=$(run_xfs_io_pwritev2 /dev/"$ns" 4096)
> + if [ "$bytes_written" == 4096 ]; then
> + echo "wrote successfully after disconnect"
> + fi
> + sleep 10
> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
> + if [[ !"${ns}" = "" ]]; then
Shellcheck warns the line above. I guess it can be as follows:
if [[ "${ns}" != "" ]]; then
> + echo "found ns after delayed removal"
> + fi
> + refcnt=$(_module_use_count nvme_core)
> + if [ "$refcnt" != "" ] && [ "$refcnt" -ne "$refcnt_orig" ]; then
> + echo "module refcount not as original"
> + fi
> +
> + # Part b: Ensure writes for an intermittent disconnect are successful.
> + # During an intermittent disconnect, any reads or writes
> + # queued should succeed after a path returns.
> + # Also ensure module refcount behaviour is as expected, as
> + # above.
> + _nvme_connect_subsys
> +
> + nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
> + refcnt_orig=$(_module_use_count nvme_core)
> + echo 10 > "/sys/block/"$ns"/delayed_removal_secs"
Again, the line above can be modified as follows for Shellcheck.
echo 10 > "/sys/block/${ns}/delayed_removal_secs"
More information about the Linux-nvme
mailing list