[bug report] NVMe/IB: reset_controller need more than 1min
Haakon Bugge
haakon.bugge at oracle.com
Fri Dec 17 03:19:59 PST 2021
> On 17 Dec 2021, at 08:03, Yi Zhang <yi.zhang at redhat.com> wrote:
>
> On Fri, Dec 17, 2021 at 1:34 AM Haakon Bugge <haakon.bugge at oracle.com> wrote:
>>
>>
>>
>>> On 16 Dec 2021, at 17:32, Yi Zhang <yi.zhang at redhat.com> wrote:
>>>
>>> On Thu, Dec 16, 2021 at 9:21 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
>>>>
>>>>
>>>> On 12/16/2021 4:18 AM, Yi Zhang wrote:
>>>>> On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
>>>>>>
>>>>>> On 12/15/2021 3:15 AM, Yi Zhang wrote:
>>>>>>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy at nvidia.com> wrote:
>>>>>>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>>>>>>>>>>>> Hi Sagi
>>>>>>>>>>>> It is still reproducible with the change, here is the log:
>>>>>>>>>>>>
>>>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>>>>
>>>>>>>>>>>> real 0m12.973s
>>>>>>>>>>>> user 0m0.000s
>>>>>>>>>>>> sys 0m0.006s
>>>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>>>>
>>>>>>>>>>>> real 1m15.606s
>>>>>>>>>>>> user 0m0.000s
>>>>>>>>>>>> sys 0m0.007s
>>>>>>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>>>>>>>>>> Yes, with -i 4, it is a stable 1.3s:
>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>> So it appears that destroying a qp takes a long time on
>>>>>>>>> IB for some reason...
>>>>>>>>>
>>>>>>>>>> real 0m1.225s
>>>>>>>>>> user 0m0.000s
>>>>>>>>>> sys 0m0.007s
>>>>>>>>>>
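
A minimal sketch of the reduced-queue connect suggested above (the target
address and subsystem NQN are the ones used in the reproduction quoted later
in this thread; -i/--nr-io-queues caps the number of I/O queues):

# nvme disconnect -n testsubsystem_0
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0 -i 4
# time nvme reset /dev/nvme0
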
>>>>>>>>>>>> # dmesg | grep nvme
>>>>>>>>>>>> [ 900.634877] nvme nvme0: resetting controller
>>>>>>>>>>>> [ 909.026958] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>>>> [ 913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>>>> [ 917.600993] nvme nvme0: resetting controller
>>>>>>>>>>>> [ 988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>>>>>>>>>> [ 988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>>>>>>>>>> [ 988.608181] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>>>> [ 993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>>>>
>>>>>>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>>>>>>>>> Then I think that we need the rdma folks to help here...
>>>>>>>>> Max?
>>>>>>>> It took me 12s to reset a controller with 63 IO queues on 5.16-rc3+.
>>>>>>>>
>>>>>>>> Can you try to reproduce with the latest versions, please?
>>>>>>>>
>>>>>>>> Or give the exact scenario?
>>>>>>> Yeah, both target and client are using Mellanox Technologies MT27700
>>>>>>> Family [ConnectX-4]. Could you try stressing "nvme reset /dev/nvme0"?
>>>>>>> The first reset takes 12s, and the slow reset can always be reproduced
>>>>>>> on the second reset operation.
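
A rough sketch of that reset stress (assuming the controller enumerates as
/dev/nvme0, as in the logs above):

# for i in $(seq 1 5); do time nvme reset /dev/nvme0; done
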
>>>>>> I created a target with 1 namespace backed by null_blk and connected to
>>>>>> it from the same server over a loopback rdma connection, using the
>>>>>> ConnectX-4 adapter.
>>>>> Could you share your loop.json file so I can try it on my environment?
>>>>
>>>> {
>>>>   "hosts": [],
>>>>   "ports": [
>>>>     {
>>>>       "addr": {
>>>>         "adrfam": "ipv4",
>>>>         "traddr": "<ip>",
>>>>         "treq": "not specified",
>>>>         "trsvcid": "4420",
>>>>         "trtype": "rdma"
>>>>       },
>>>>       "portid": 1,
>>>>       "referrals": [],
>>>>       "subsystems": [
>>>>         "testsubsystem_0"
>>>>       ]
>>>>     }
>>>>   ],
>>>>   "subsystems": [
>>>>     {
>>>>       "allowed_hosts": [],
>>>>       "attr": {
>>>>         "allow_any_host": "1",
>>>>         "cntlid_max": "65519",
>>>>         "cntlid_min": "1",
>>>>         "model": "Linux",
>>>>         "serial": "3d83c78b76623f1d",
>>>>         "version": "1.3"
>>>>       },
>>>>       "namespaces": [
>>>>         {
>>>>           "device": {
>>>>             "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
>>>>             "path": "/dev/nullb0",
>>>>             "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
>>>>           },
>>>>           "enable": 1,
>>>>           "nsid": 1
>>>>         }
>>>>       ],
>>>>       "nqn": "testsubsystem_0"
>>>>     }
>>>>   ]
>>>> }
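
For reference, a config like the above is normally loaded with nvmetcli once
the null_blk backing device exists; a minimal sketch (the module parameter
below is an assumption, not taken from this thread):

# modprobe null_blk nr_devices=1
# nvmetcli restore loop.json
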
>>>
>>> Thanks, I reproduced it with your json file on one server over a
>>> loopback rdma connection:
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0
>>>
>>> real 0m4.557s
>>> user 0m0.000s
>>> sys 0m0.005s
>>> # time nvme reset /dev/nvme0
>>>
>>> real 0m13.176s
>>> user 0m0.000s
>>> sys 0m0.007s
>>> # time nvme reset /dev/nvme0
>>>
>>> real 1m16.577s
>>> user 0m0.002s
>>> sys 0m0.005s
>>> # time nvme disconnect -n testsubsystem_0
>>
>> What does:
>>
>> # rdma res show qp | grep -c ERR
>>
>> say, when it is slow?
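
One way to sample that repeatedly while the reset runs in another shell
(the 1-second interval is an assumption):

# while true; do rdma res show qp | grep -c ERR; sleep 1; done
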
>
> Hi Haakon
> Here is the output during the reset operation:
>
> 38
> 33
> 28
> 24
> 19
> 14
> 9
> 4
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 0
> 0
> 0
OK. I had a hunch that many QPs in the ERR state could be the reason. But this is not many, so I rest my case :-) Thanks for trying though!
Thxs, Håkon
>
>>
>>
>>
>> Thxs, Håkon
>>
>
>
>
> --
> Best Regards,
> Yi Zhang