[bug report] NVMe/IB: reset_controller need more than 1min
Sagi Grimberg
sagi at grimberg.me
Sun Mar 20 06:03:58 PDT 2022
>>> These are very long times for a non-debug kernel...
>>> Max, do you see the root cause for this?
>>>
>>> Yi, does this happen with rxe/siw as well?
>> Hi Sagi
>>
>> rxe/siw will take less than 1s
>> with rdma_rxe
>> # time nvme reset /dev/nvme0
>> real 0m0.094s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> with siw
>> # time nvme reset /dev/nvme0
>> real 0m0.097s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> This is only reproducible with the mlx IB card. As I mentioned before,
>> the reset operation time changed from 3s to 12s after the commit below.
>> Could you check this commit?
>>
>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
>> Author: Max Gurtovoy <maxg at mellanox.com>
>> Date: Tue May 19 17:05:56 2020 +0300
>>
>> nvme-rdma: add metadata/T10-PI support
>>
> I couldn't repro these long reset times.
Maybe it only appears when setting up a controller with lots of
queues?
> Nevertheless, the above commit added T10-PI offloads.
>
> In this commit, for supported devices we create extra resources in HW
> (more memory keys per task).
>
> I suggested doing this configuration as part of the "nvme connect"
> command, so that this resource allocation is skipped by default, but
> during the review I was asked to make it the default behavior.
I don't know whether I was the one who gave you this feedback, but it
probably didn't occur to the commenter that it would make connection
establishment take tens of seconds.
> Sagi/Christoph,
>
> WDYT? Should we reconsider the "nvme connect --with_metadata" option?
Maybe you can make these lazily allocated?
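Roughly what I mean by lazy allocation, as a userspace sketch only. This is
not the nvme-rdma code: struct pi_resource, struct request_ctx and the
request_*() helpers are made-up names for illustration, and malloc() stands in
for whatever the expensive HW resource allocation would be. The point is just
that setup allocates nothing and the cost is paid by the first request that
actually needs the resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expensive per-task HW resource. */
struct pi_resource {
    int id;
};

/* Hypothetical per-request context; not the real nvme-rdma layout. */
struct request_ctx {
    struct pi_resource *pi; /* NULL until a request first needs it */
};

/* Counts how many expensive allocations were actually performed. */
static int pi_alloc_count;

/* Setup stays cheap: nothing metadata-related is allocated here. */
static void request_init(struct request_ctx *req)
{
    req->pi = NULL;
}

/* Allocate on first use and reuse afterwards; NULL on allocation failure. */
static struct pi_resource *request_get_pi(struct request_ctx *req)
{
    if (!req->pi) {
        req->pi = malloc(sizeof(*req->pi));
        if (!req->pi)
            return NULL;
        req->pi->id = ++pi_alloc_count;
    }
    return req->pi;
}

/* Teardown frees the resource only if it was ever allocated. */
static void request_exit(struct request_ctx *req)
{
    free(req->pi); /* free(NULL) is a no-op */
    req->pi = NULL;
}
```

The tradeoff is that the first I/O needing the resource pays the allocation
latency and has to handle allocation failure in the I/O path.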