[PATCH blktests] tests/nvme/031: fix connection failure

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Mon Sep 11 04:00:17 PDT 2023


On Sep 08, 2023 / 15:09, Daniel Wagner wrote:
> On Fri, Sep 08, 2023 at 11:26:31AM +0000, Shinichiro Kawasaki wrote:
> > On Sep 07, 2023 / 11:44, Yi Zhang wrote:
> > > allow_any_host was disabled during _create_nvmet_subsystem, so call
> > > _create_nvmet_host before connecting to allow the host to connect.
> > >
> > > [76096.420586] nvmet: adding nsid 1 to subsystem blktests-subsystem-0
> > > [76096.440595] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
> > > [76096.491344] nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 for subsystem blktests-subsystem-0 not allowed
> > > [76096.505049] nvme nvme2: Connect for subsystem blktests-subsystem-0 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
> > > [76096.519609] nvme nvme2: failed to connect queue: 0 ret=16772
> > >
> > > Signed-off-by: Yi Zhang <yi.zhang at redhat.com>
> 
> Reviewed-by: Daniel Wagner <dwagner at suse.de>
> 
> > Thanks for catching this. I looked back at the past changes and found that
> > commit c32b233b7dd6 ("nvme/rc: Add helper for adding/removing to allow list")
> > triggered the connection failure. So I think a Fixes tag for this commit is
> > required (I can add it when this patch is applied).

I've applied the fix. Thanks!

> >
> > Even after the commit, the test case still passes. That's why I did not notice
> > the connection failure. I think _nvme_connect_subsys() should check the exit
> > status of the "nvme connect" command and print an error message on failure.
> > This will help catch similar connection failures in the future.
> 
> I was running into a similar problem with a (not yet existing) nvme/050
> test case [1]:
> 
> nvmf_wait_for_state() {
>        local def_state_timeout=5
>        local subsys_name="$1"
>        local state="$2"
>        local timeout="${3:-$def_state_timeout}"
>        local nvmedev
>        local state_file
>        local start_time
>        local end_time
> 
>        nvmedev=$(_find_nvme_dev "${subsys_name}")
>        state_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/state"
> 
>        start_time=$(date +%s)
>        while ! grep -q "${state}" "${state_file}"; do
>                sleep 1
>                end_time=$(date +%s)
>                if (( end_time - start_time > timeout )); then
>                        echo "expected state \"${state}\" not " \
>                                "reached within ${timeout} seconds"
>                        return 1
>                fi
>        done
> 
>        return 0
> }
> 
> _nvme_connect_subsys "${nvme_trtype}" "${def_subsysnqn}" \
>                             --hostnqn "${def_hostnqn}" \
>                             --reconnect-delay 1 \
>                             --ctrl-loss-tmo 10
> 
> nvmf_wait_for_state "${def_subsysnqn}" "live"
> nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
> 
> We could make this a bit more generic and move it into the connect
> helper. What do you think?

This nvme state file check looks a bit complicated, but also more robust than
checking the exit status of the "nvme connect" command. Do you think "nvme
connect" can fail even when the command returns a good exit status? If so, your
approach will be the way to go.
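For the exit-status side, something along these lines might do. This is only a
sketch, not the actual blktests helper: the name _run_checked is made up for
illustration, and the demo uses true/false as stand-ins for a real "nvme
connect" invocation.

```shell
#!/bin/sh
# Hypothetical sketch: run a command (e.g. "nvme connect ...") and report
# a non-zero exit status instead of silently discarding it. The helper
# name _run_checked is invented here; it is not part of blktests.
_run_checked() {
	"$@"
	status=$?
	if [ "$status" -ne 0 ]; then
		echo "command failed (status $status): $*" >&2
		return "$status"
	fi
}

# Demo with stand-in commands instead of a real "nvme connect":
_run_checked true && echo "connect ok"
_run_checked false 2>/dev/null || echo "connect failure detected"
```

The point is simply that the caller sees both an error message and a non-zero
return value, so a silent connect failure like the one above cannot slip
through.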

> 
> [1] https://lore.kernel.org/linux-nvme/20230621155825.20146-2-dwagner@suse.de/
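As an aside, the nvmf_wait_for_state() loop quoted above could be generalized
into a "wait until a file contains a pattern" helper, roughly as below. This is
a sketch under assumed naming (_wait_for_file_content is made up), demonstrated
against a temporary file rather than the real /sys/class/nvme-fabrics state
file.

```shell
#!/bin/sh
# Sketch of a generic polling helper, following the structure of the
# nvmf_wait_for_state() proposal above. The name _wait_for_file_content
# is invented for illustration; it is not a blktests helper.
_wait_for_file_content() {
	pattern="$1"
	file="$2"
	timeout="${3:-5}"
	start_time=$(date +%s)
	while ! grep -q "$pattern" "$file" 2>/dev/null; do
		sleep 1
		end_time=$(date +%s)
		if [ $((end_time - start_time)) -gt "$timeout" ]; then
			echo "pattern \"$pattern\" not found in $file" \
				"within $timeout seconds"
			return 1
		fi
	done
}

# Demo against a temporary file standing in for the controller state file:
tmp=$(mktemp)
echo "live" > "$tmp"
_wait_for_file_content "live" "$tmp" 2 && echo "state reached"
rm -f "$tmp"
```

Moving the nvme-specific parts (finding the device, building the state file
path) into the connect helper and keeping the polling loop generic would let
other wait-for-sysfs checks reuse it.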
