[bug report] NVMe/IB: reset_controller needs more than 1min

Yi Zhang yi.zhang at redhat.com
Mon Feb 28 16:06:07 PST 2022


On Wed, Feb 23, 2022 at 6:04 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
>
> Hi Yi Zhang,
>
> thanks for testing the patches.
>
> Can you provide more info on the time it took with both kernels?

Hi Max
Sorry for the late response. Here are the test results/dmesg on the
debug and non-debug kernels with your patches:
debug kernel: timeout
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
real    0m16.956s
user    0m0.000s
sys     0m0.237s
# time nvme reset /dev/nvme0
real    1m33.623s
user    0m0.000s
sys     0m0.024s
# time nvme disconnect-all
real    1m26.640s
user    0m0.000s
sys     0m9.969s

host dmesg:
https://pastebin.com/8T3Lqtkn
target dmesg:
https://pastebin.com/KpFP7xG2

non-debug kernel: no timeout issue, but reset still takes 12s and
disconnect 8s
host:
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn

real    0m4.579s
user    0m0.000s
sys     0m0.004s
# time nvme reset /dev/nvme0

real    0m12.778s
user    0m0.000s
sys     0m0.006s
# time nvme reset /dev/nvme0

real    0m12.793s
user    0m0.000s
sys     0m0.006s
# time nvme reset /dev/nvme0

real    0m12.808s
user    0m0.000s
sys     0m0.006s
# time nvme disconnect-all

real    0m8.348s
user    0m0.000s
sys     0m0.189s

>
> The patches don't intend to decrease this time, but to restart the KA
> (keep-alive) at an early stage - as soon as we create the AQ (admin queue).
>
> I guess we need to debug it offline.
>
> On 2/21/2022 12:00 PM, Yi Zhang wrote:
> > Hi Max
> >
> > The patches fixed the timeout issue when I used a non-debug kernel,
> > but when I tested on a debug kernel with your patches, the timeout
> > can still be triggered by "nvme reset"/"nvme disconnect-all" operations.
> >
> > On Tue, Feb 15, 2022 at 10:31 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
> >> Thanks Yi Zhang.
> >>
> >> A few years ago I sent some patches that were supposed to fix the KA
> >> mechanism, but eventually they weren't accepted.
> >>
> >> I haven't tested them since, but maybe you can run some tests with them.
> >>
> >> The attached patches are partial and cover only the rdma transport,
> >> for your testing.
> >>
> >> If they work for you, we can work on them again and argue for correctness.
> >>
> >> Please don't use the workaround we suggested earlier with these patches.
> >>
> >> -Max.
> >>
> >> On 2/15/2022 3:52 PM, Yi Zhang wrote:
> >>> Hi Sagi/Max
> >>>
> >>> Changing the value to 10 or 15 fixed the timeout issue.
> >>> The reset operation still needs more than 12s in my environment. I
> >>> also tried disabling pi_enable, and the reset operation then drops
> >>> back to 3s, so it seems the added 9s comes from the PI-enabled code path.
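> >>>
> >>> For reference, a rough sketch of how pi_enable can be cleared on the
> >>> target before enabling the port, assuming the standard nvmet configfs
> >>> layout:
> >>>
> >>> # PORT=1   # illustrative port ID, use the one from your setup
> >>> # echo 0 > /sys/kernel/config/nvmet/ports/${PORT}/param_pi_enable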
> >>>
> >>> On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
> >>>> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
> >>>>>> Hi Sagi/Max
> >>>>>> Here are more findings with the bisect:
> >>>>>>
> >>>>>> The time for the reset operation changed from 3s[1] to 12s[2] after
> >>>>>> commit [3], and after commit [4] the reset operation times out at the
> >>>>>> second reset[5]; let me know if you need any testing for it, thanks.
> >>>>> Does this at least eliminate the timeout?
> >>>>> --
> >>>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> >>>>> index a162f6c6da6e..60e415078893 100644
> >>>>> --- a/drivers/nvme/host/nvme.h
> >>>>> +++ b/drivers/nvme/host/nvme.h
> >>>>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
> >>>>>    extern unsigned int admin_timeout;
> >>>>>    #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
> >>>>>
> >>>>> -#define NVME_DEFAULT_KATO      5
> >>>>> +#define NVME_DEFAULT_KATO      10
> >>>>>
> >>>>>    #ifdef CONFIG_ARCH_NO_SG_CHAIN
> >>>>>    #define  NVME_INLINE_SG_CNT  0
> >>>>> --
> >>>>>
> >>>> or, for the initial test, you can use the --keep-alive-tmo=<10 or 15>
> >>>> flag in the connect command.
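> >>>>
> >>>> For example, reusing the connect parameters from the transcript above
> >>>> (--keep-alive-tmo, short form -k, is the nvme-cli fabrics option for
> >>>> this; the address/nqn are just the ones from this report):
> >>>>
> >>>> # nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn \
> >>>>       --keep-alive-tmo=15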
> >>>>
> >>>>>> [1]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m3.049s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [2]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m12.498s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [3]
> >>>>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> >>>>>> Author: Max Gurtovoy <maxg at mellanox.com>
> >>>>>> Date:   Tue May 19 17:05:56 2020 +0300
> >>>>>>
> >>>>>>        nvme-rdma: add metadata/T10-PI support
> >>>>>>
> >>>>>> [4]
> >>>>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> >>>>>> Author: Hannes Reinecke <hare at suse.de>
> >>>>>> Date:   Fri Apr 16 13:46:20 2021 +0200
> >>>>>>
> >>>>>>        nvme: sanitize KATO setting
> >>>>> This change effectively reduced the keep-alive timeout (KATO) from
> >>>>> 15 seconds to 5, and modified the host to send keepalives every
> >>>>> 2.5 seconds instead of every 5.
> >>>>>
> >>>>> I guess that, combined with the fact that it now takes longer to
> >>>>> create and delete rdma resources (either QPs or MRs), it starts
> >>>>> to time out in setups where there are a lot of queues.
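> >>>>>
> >>>>> One way to check that would be to connect with fewer I/O queues and
> >>>>> see whether the reset time scales with the queue count, e.g. (the
> >>>>> queue count is just an example; --nr-io-queues is the nvme-cli
> >>>>> connect option):
> >>>>>
> >>>>> # nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn \
> >>>>>       --nr-io-queues=4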
> >
> >
>


-- 
Best Regards,
  Yi Zhang



