[PATCH] nvme-rdma: set ack timeout of RoCE to 262ms

Max Gurtovoy mgurtovoy at nvidia.com
Mon Aug 22 08:30:27 PDT 2022


On 8/22/2022 12:50 PM, Chao Leng wrote:
>
>
> On 2022/8/21 14:20, Christoph Hellwig wrote:
>> On Fri, Aug 19, 2022 at 03:58:25PM +0800, Chao Leng wrote:
>>> Now the ack timeout of RoCE is 2 seconds (2^(18+1)*4us = 2 seconds). In
>>> the case of low concurrency, if some packets are lost due to network
>>> anomalies such as network rerouting or optical fiber signal
>>> interference, it waits 2 seconds before retransmitting the lost packets.
>>> As a result, the I/O latency is greater than 2 seconds.
>>> That I/O latency is too long for real-time transaction services. We do
>>> not need to wait that long to conclude that packets are lost.
>>> Setting the ack timeout to 262ms (2^(15+1)*4us = 262ms) is sufficient.
>>
>> I'll leave people more familiar with RoCE to judge the merits of this
>> change, but I really want a comment explaining the choice in the
>> source code.
> The TCP retransmission timeout interval is currently 250ms, and this
> setting has been maintained for many years.
> The network quality of RDMA is better than that of common Ethernet.
> That is the reason for choosing 262ms as the default ack timeout.
> Adding a module parameter may be a better option.
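
For reference, the arithmetic quoted above can be checked with a small
sketch. It uses the formula exactly as written in the patch description
(2^(n+1) * 4us, with the rounded 4us base from the thread). The helper
name ack_timeout_us() and the macro are hypothetical and are not part of
the kernel or of the patch.

```c
#include <stdint.h>

/* Base unit in microseconds, as written in the patch description (4us). */
#define ACK_TIMEOUT_BASE_US 4ULL

/*
 * Hypothetical helper (not a kernel function): ack timeout in
 * microseconds for exponent n, using the thread's formula 2^(n+1) * 4us.
 */
static inline uint64_t ack_timeout_us(unsigned int n)
{
	/* Shifting left by (n + 1) multiplies the base by 2^(n+1). */
	return ACK_TIMEOUT_BASE_US << (n + 1);
}
```

With n = 18 this gives 2097152us (about 2 seconds), and with n = 15 it
gives 262144us (about 262ms), matching the two figures in the patch
description.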

Are you solving a real issue that you encountered?

If so, which devices did you use?



