[PATCH blktests v3] nvme/046: test queue count changes on reconnect
Daniel Wagner
dwagner at suse.de
Wed Sep 14 04:07:17 PDT 2022
On Wed, Sep 14, 2022 at 01:37:38PM +0300, Sagi Grimberg wrote:
>
> > > > > FYI, each blktests test case can define DMESG_FILTER not to fail with specific
> > > > > keywords in dmesg. Test cases meta/011 and block/028 are reference use
> > > > > cases.
> > > >
> > > > Ah okay, let me look into it.
> > >
> > > So I made the state read function a bit more robust (test if state file
> > > exists) and then it turns out this made rdma happy(??) but tcp is still
> > > breaking.
> >
> > s/tcp/fc/
> >
> > On closer inspection I see the following sequence for fc:
> >
> > [399664.863585] nvmet: connect request for invalid subsystem blktests-subsystem-1!
> > [399664.863704] nvme nvme0: Connect Invalid Data Parameter, subsysnqn "blktests-subsystem-1"
> > [399664.863758] nvme nvme0: NVME-FC{0}: reset: Reconnect attempt failed (16770)
> > [399664.863784] nvme nvme0: NVME-FC{0}: reconnect failure
> > [399664.863837] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"
> >
> > When the host tries to reconnect to a non-existent controller (the test
> > called _remove_nvmet_subsystem_from_port()) the target returns 0x4182
> > (NVME_SC_DNR|NVME_SC_READ_ONLY(?)).
>
> That is not something that the target is supposed to be doing, I have no
> idea why this is sent. Perhaps this is something specific to the fc
> implementation?
Okay, I'll look into this.
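
For reference, here is a quick sketch that decodes the 16770 from the
"Reconnect attempt failed" line above. It assumes the host-side status
layout as I remember it from include/linux/nvme.h (DNR at 0x4000, status
code type in bits 10:8, status code in bits 7:0); NVME_SC_MASK and
decode_status() are names made up for this example, so please double
check against the header.

```python
# Assumed layout, recalled from include/linux/nvme.h (not verified here).
NVME_SC_DNR = 0x4000   # Do Not Retry bit
NVME_SC_MASK = 0x7ff   # hypothetical name: SCT (bits 10:8) plus SC (bits 7:0)


def decode_status(status):
    """Split a host-side NVMe status value into its fields."""
    return {
        "dnr": bool(status & NVME_SC_DNR),  # True if the DNR bit is set
        "sct": (status >> 8) & 0x7,         # status code type
        "sc": status & 0xff,                # status code within that type
        "code": status & NVME_SC_MASK,      # SCT and SC combined
    }


# Value taken from the fc log line "Reconnect attempt failed (16770)".
fields = decode_status(16770)
print(hex(16770), {k: (hex(v) if k != "dnr" else v) for k, v in fields.items()})
```

16770 is 0x4182, so DNR is set and the remaining code is 0x182. If I read
the header correctly, 0x182 is shared between NVME_SC_READ_ONLY and the
fabrics NVME_SC_CONNECT_INVALID_PARAM, which would match the "Connect
Invalid Data Parameter" message in the log.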
> > So arguably fc behaves correctly by
> > stopping the reconnects. tcp and rdma just ignore the DNR.
>
> DNR means do not retry the command, it says nothing about do not attempt
> a future reconnect...
That makes sense.
> > If we agree that the fc behavior is the right one, then the nvmet code
> > needs to be changed so that when the qid_max attribute changes it forces
> > a reconnect. The trick with calling _remove_nvmet_subsystem_from_port()
> > to force a reconnect is not working. And tcp/rdma needs to honor the
> > DNR.
>
> tcp/rdma honor DNR afaik.
I misinterpreted DNR. As you pointed out, it only applies to the
command, not to the reconnect attempt.

So do we agree that the fc host should not stop reconnecting? James?