[bug report] NVMe/IB: reset_controller need more than 1min

Yi Zhang yi.zhang at redhat.com
Fri Dec 10 19:01:56 PST 2021


On Fri, Jun 25, 2021 at 12:14 AM Yi Zhang <yi.zhang at redhat.com> wrote:
>
> On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi at grimberg.me> wrote:
> >
> >
> > > Hello
> > >
> > > Gentle ping here, this issue still exists on latest 5.13-rc7
> > >
> > > # time nvme reset /dev/nvme0
> > >
> > > real 0m12.636s
> > > user 0m0.002s
> > > sys 0m0.005s
> > > # time nvme reset /dev/nvme0
> > >
> > > real 0m12.641s
> > > user 0m0.000s
> > > sys 0m0.007s
> >
> > Strange that even normal resets take so long...
> > What device are you using?
>
> Hi Sagi
>
> Here is the device info:
> Mellanox Technologies MT27700 Family [ConnectX-4]
>
> >
> > > # time nvme reset /dev/nvme0
> > >
> > > real 1m16.133s
> > > user 0m0.000s
> > > sys 0m0.007s
> >
> > There seems to be a spurious command timeout here, but maybe this
> > is due to the fact that the queues take so long to connect and
> > the target expires the keep-alive timer.
> >
> > Does this patch help?
>
> The issue still exists, let me know if you need more testing for it. :)

Hi Sagi,

Ping: this issue can still be reproduced on the latest
linux-block/for-next. Could you take another look when you have a
chance? Thanks.
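
For readers following the thread, the patch quoted below relies on the
target's traffic-based keep-alive behaviour: if traffic was seen since the
last timer expiry, the timer is re-armed instead of tearing the controller
down. Here is a minimal userspace sketch of that idea. It is only a model
under stated assumptions, not the kernel code: the struct and the function
names (model_ctrl, model_note_traffic, model_keep_alive_expired) are
hypothetical, and only the reset_tbkas flag name comes from the patch.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the target's controller state. */
struct model_ctrl {
    unsigned int kato_sec;    /* keep-alive timeout in seconds (5 in the log) */
    bool reset_tbkas;         /* traffic was seen since the last timer expiry */
    bool fatal;               /* set once the model tears the controller down */
    unsigned int rearm_count; /* how many times the timer was re-armed */
};

/*
 * Record an event that should count as traffic. The quoted patch proposes
 * treating a queue install as such an event by setting reset_tbkas there.
 */
static void model_note_traffic(struct model_ctrl *ctrl)
{
    ctrl->reset_tbkas = true;
}

/*
 * Model of the keep-alive timer firing. If traffic was noted since the
 * previous expiry, consume the flag and re-arm; otherwise treat the
 * expiry as fatal. Returns true when the timer was re-armed.
 */
static bool model_keep_alive_expired(struct model_ctrl *ctrl)
{
    bool saw_traffic = ctrl->reset_tbkas;

    ctrl->reset_tbkas = false;
    if (saw_traffic) {
        ctrl->rearm_count++;
        return true;
    }
    ctrl->fatal = true;
    return false;
}
```

In this model, one model_note_traffic() call buys exactly one extra timeout
period: the next expiry re-arms the timer, and a second expiry with no
traffic in between is fatal. That matches the intent stated in the patch
comment, but it would not help if queue setup stalls for longer than the
re-armed period.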


>
>
> > --
> > diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
> > index 7d0f3523fdab..f4a7db1ab3e5 100644
> > --- a/drivers/nvme/target/fabrics-cmd.c
> > +++ b/drivers/nvme/target/fabrics-cmd.c
> > @@ -142,6 +142,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl *ctrl, struct nvmet_req *req)
> >                  }
> >          }
> >
> > +       /*
> > +        * Controller establishment flow may take some time, and the
> > +        * host may not send us keep-alive during this period, hence reset the
> > +        * traffic based keep-alive timer so we don't trigger a
> > +        * controller teardown as a result of a keep-alive expiration.
> > +        */
> > +       ctrl->reset_tbkas = true;
> > +
> >          return 0;
> >
> >   err:
> > --
> >
> > >> target:
> > >> [  934.306016] nvmet: creating controller 1 for subsystem testnqn for NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > >> [  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> > >> [  944.900051] nvmet: ctrl 1 fatal error occurred!
> > >> [ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > >>
> > >> client:
> > >> [  857.264029] nvme nvme0: resetting controller
> > >> [  864.115369] nvme nvme0: creating 40 I/O queues.
> > >> [  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> > >> [  868.001673] nvme nvme0: resetting controller
> > >> [  935.396789] nvme nvme0: I/O 9 QID 0 timeout
> > >> [  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
> > >> [  935.438080] nvme nvme0: creating 40 I/O queues.
> > >> [  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >
>
>
> --
> Best Regards,
>   Yi Zhang



--
Best Regards,
  Yi Zhang