[bug report] NVMe/IB: reset_controller need more than 1min

Sagi Grimberg sagi at grimberg.me
Sun Mar 20 06:03:58 PDT 2022


>>> These are very long times for a non-debug kernel...
>>> Max, do you see the root cause for this?
>>>
>>> Yi, does this happen with rxe/siw as well?
>> Hi Sagi
>>
>> rxe/siw will take less than 1s
>> with rdma_rxe
>> # time nvme reset /dev/nvme0
>> real 0m0.094s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> with siw
>> # time nvme reset /dev/nvme0
>> real 0m0.097s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> This is only reproducible with an mlx IB card. As I mentioned before,
>> the reset operation time changed from 3s to 12s after the commit below.
>> Could you check this commit?
>>
>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
>> Author: Max Gurtovoy <maxg at mellanox.com>
>> Date:   Tue May 19 17:05:56 2020 +0300
>>
>>      nvme-rdma: add metadata/T10-PI support
>>
> I couldn't repro these long reset times.

Maybe it only shows up when setting up a controller with lots of
queues?

> Nevertheless, the above commit added T10-PI offloads.
> 
> In this commit, for supported devices we create extra resources in HW 
> (more memory keys per task).
> 
> I suggested doing this configuration as part of the "nvme connect"
> command, so that this resource allocation is skipped by default, but
> during the review I was asked to make it the default behavior.

I don't know whether I was the one who gave you this feedback, but it
probably didn't occur to the reviewer that it would make connection
establishment take tens of seconds.

> Sagi/Christoph,
> 
> WDYT? Should we reconsider the "nvme connect --with_metadata" option?

Maybe you can make these lazily allocated?
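
Something along these lines, as a rough userspace sketch only. This is
not nvme-rdma code: struct pi_mr, struct queue, queue_get_pi_mrs() and
the mrs_allocated counter are all hypothetical names, and malloc()
stands in for whatever the real (expensive) HW resource allocation is.
The idea is just that queue setup allocates nothing for metadata, and
the first request that actually needs it pays the cost once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expensive per-task HW resource. */
struct pi_mr {
	int id;
};

/* Hypothetical queue: metadata resources are absent until first use. */
struct queue {
	struct pi_mr **pi_mrs;	/* NULL until a metadata request arrives */
	size_t depth;		/* number of tasks on this queue */
	bool pi_ready;		/* true once pi_mrs has been populated */
};

/* Counts live allocations so the effect of laziness is observable. */
static int mrs_allocated;

/* Queue setup (connect/reset path): no metadata resources allocated. */
static int queue_init(struct queue *q, size_t depth)
{
	assert(q != NULL);
	q->pi_mrs = NULL;
	q->depth = depth;
	q->pi_ready = false;
	return 0;
}

/* Called from the I/O path only when a request carries metadata. */
static int queue_get_pi_mrs(struct queue *q)
{
	size_t i;

	assert(q != NULL);
	if (q->pi_ready)
		return 0;	/* already allocated by an earlier request */

	q->pi_mrs = calloc(q->depth, sizeof(*q->pi_mrs));
	if (!q->pi_mrs)
		return -1;

	for (i = 0; i < q->depth; i++) {
		q->pi_mrs[i] = malloc(sizeof(*q->pi_mrs[i]));
		if (!q->pi_mrs[i])
			goto err;
		q->pi_mrs[i]->id = mrs_allocated++;
	}
	q->pi_ready = true;
	return 0;

err:
	/* Unwind the entries allocated so far. */
	while (i--) {
		free(q->pi_mrs[i]);
		mrs_allocated--;
	}
	free(q->pi_mrs);
	q->pi_mrs = NULL;
	return -1;
}

/* Teardown frees the metadata resources only if they were ever created. */
static void queue_destroy(struct queue *q)
{
	size_t i;

	assert(q != NULL);
	if (!q->pi_ready)
		return;
	for (i = 0; i < q->depth; i++) {
		free(q->pi_mrs[i]);
		mrs_allocated--;
	}
	free(q->pi_mrs);
	q->pi_mrs = NULL;
	q->pi_ready = false;
}
```

With this shape, queue_init() stays cheap no matter how many queues
there are, and a controller that never sees metadata I/O never pays for
the extra resources. The tradeoff is that the first metadata request on
each queue takes the allocation latency, and the I/O path now has an
allocation that can fail.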


