[PATCH] nvme-rdma: set ack timeout of RoCE to 262ms

Chao Leng lengchao at huawei.com
Mon Aug 29 06:15:50 PDT 2022



On 2022/8/29 17:06, Sagi Grimberg wrote:
> 
>>>>> If so, which devices did you use ?
>>>> The host HBA is Mellanox Technologies MT27800 Family [ConnectX-5];
>>>> The switch and storage are Huawei equipment.
>>>> In principle, switches and storage devices from other vendors
>>>> have the same problem.
>>>> If you think it is necessary, we can test other vendors' switches
>>>> and a Linux target.
>>>
>>> Why is the 2s default chosen, what is the downside of a 250ms ack
>>> timeout? And why is nvme-rdma different from all other kernel rdma
>> The downside is a redundant retransmit if a packet is delayed more than
>> 250ms in the network but finally reaches the receiver.
>> The packet delay exceeds 250ms only in extreme scenarios.
> 
> Sounds like the default needs to be changed if it only addresses the
> extreme scenarios...
> 
>>> consumers that it needs to set this explicitly?
>> Real-time transaction services are sensitive to delay.
>> nvme-rdma will be used in real-time transactions.
>> Real-time transaction services do not allow packets to be
>> delayed more than 250ms in the network.
>> So we need to set the ack timeout to 262ms.
> 
> While I don't disagree with the change itself, I do disagree why this
> needs to be driven by nvme-rdma locally. If all kernel rdma consumers
> need this (and if not, I'd like to understand why), this needs to be set
> in the rdma core.
Changing the default set in the rdma core is another option.
But it will affect all applications based on RDMA.
Max, what do you think? Thank you.
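
For reference, a rough sketch of the arithmetic behind the 2s and 262ms
figures discussed in this thread. It assumes the ack timeout uses the
InfiniBand local ACK timeout encoding of 4.096us * 2^exponent. The
exponents 19 and 16, the 4us rounding, and the helper names are
illustrative assumptions, not values taken from the patch.

```python
# Sketch only: assumes the InfiniBand-style encoding
#   timeout = 4.096 us * 2**exponent
# The exponents used below are assumptions chosen because they land
# near the 2s and 262ms figures mentioned in the thread.

def ack_timeout_ms(exponent: int, base_us: float = 4.096) -> float:
    """Return the ack timeout in milliseconds for a given exponent."""
    return base_us * (2 ** exponent) / 1000.0


def is_redundant_retransmit(delay_ms: float, exponent: int) -> bool:
    """True if a packet that still arrives after delay_ms would already
    have been retransmitted because the ack timeout expired first."""
    return delay_ms > ack_timeout_ms(exponent)


if __name__ == "__main__":
    # Exponent 19: roughly the 2s default.
    print(f"exponent 19: {ack_timeout_ms(19):.3f} ms")
    # Exponent 16 with the exact 4.096us base.
    print(f"exponent 16: {ack_timeout_ms(16):.3f} ms")
    # Exponent 16 with the base rounded to 4us gives the 262ms figure.
    print(f"exponent 16 (4us base): {ack_timeout_ms(16, base_us=4.0):.3f} ms")
    # A packet delayed 300ms is retransmitted redundantly only with the
    # shorter timeout.
    print(is_redundant_retransmit(300.0, 16), is_redundant_retransmit(300.0, 19))
```

Under these assumptions, a packet that arrives after 300ms would already
have been retransmitted with the shorter timeout but not with the 2s
default, which is the redundant-retransmit downside mentioned earlier.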
> .


