[PATCH 1/3] nvme-tcp: improve rx/tx fairness

Sagi Grimberg sagi at grimberg.me
Tue Jul 9 00:06:13 PDT 2024



On 09/07/2024 9:51, Hannes Reinecke wrote:
> On 7/8/24 21:31, Sagi Grimberg wrote:
>>
>>
>> On 08/07/2024 18:50, Hannes Reinecke wrote:
> [ .. ]
>>>
>>> Weellll ... if 'SOCK_NOSPACE' is for blocking sockets, why do we 
>>> even get the 'write_space()' callback?
>>> It gets triggered quite often, and checking for the SOCK_NOSPACE bit 
>>> before sending drops the number of invocations quite significantly.
>>
>> I need to check, but where did you test it? In the inline path?
>> Thinking further, I do agree that because the io_work is shared, we may
>> sendmsg immediately as space becomes available instead of waiting for
>> some minimum space.
>> Can you please quantify this with your testing?
>> How many times do we get the write_space() callback? How many times do
>> we get EAGAIN, and what is the perf impact? Also interesting whether
>> this is more apparent in specific workloads. I'm assuming it's more
>> apparent with large write workloads.
>
> Stats for my testing
> (second row is controller 1, third row controller 2):

Which row? You mean column?
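
Coming back to the SOCK_NOSPACE point above: the gating Hannes describes
would look roughly like the sketch below, i.e. the existing
nvme_tcp_write_space() with the bit test folded in. The
test_and_clear_bit() form is my guess for illustration, not committed
code:

static void nvme_tcp_write_space(struct sock *sk)
{
	struct nvme_tcp_queue *queue;

	read_lock_bh(&sk->sk_callback_lock);
	queue = sk->sk_user_data;
	/* only kick io_work if a previous sendmsg() actually ran out
	 * of space and set SOCK_NOSPACE; otherwise stay a no-op.
	 */
	if (likely(queue && sk_stream_is_writeable(sk)) &&
	    test_and_clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	read_unlock_bh(&sk->sk_callback_lock);
}

Note that sk_stream_is_writeable() already includes a minimum-space test
(sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)), which is what I
meant by "minimum space" above.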

>
> write_space:
> 0: 0 0
> 1: 489 2
> 2: 31 11
> 3: 50 1
> 4: 299 42
> 5: 1 19
> 6: 12 0
> 7: 737 16
> 8: 636 1
> 9: 19 1
> 10: 325 20
> 11: 28 1
> 12: 14 1
> 13: 252 1
> 14: 100 12
> 15: 310 1
> 16: 454 2

These are incrementing counters of how many times ctrl1 and ctrl2 got 
the .write_space() callback?
What are 0-16? Queues? I thought you had 32 queues...
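
For context, per-queue numbers like these are usually just a counter
bumped in the relevant path and printed at teardown, something like the
hypothetical sketch below (the write_space_cnt/queue_busy_cnt fields
don't exist upstream):

/* hypothetical per-queue debug counters; not the actual instrumentation */
struct nvme_tcp_queue_dbg {
	unsigned long	write_space_cnt;
	unsigned long	queue_busy_cnt;
};

static void nvme_tcp_dump_dbg(int qid, struct nvme_tcp_queue_dbg *dbg)
{
	pr_info("%d: write_space %lu queue_busy %lu\n",
		qid, dbg->write_space_cnt, dbg->queue_busy_cnt);
}

If that is the shape of it, what I'm really asking is where the counters
are bumped.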

>
> queue_busy:
> 0: 0 0
> 1: 396 2
> 2: 22 29
> 3: 56 5
> 4: 153 99
> 5: 2 40
> 6: 18 0
> 7: 632 5
> 8: 590 13
> 9: 129 2
> 10: 66 18
> 11: 91 8
> 12: 56 14
> 13: 464 7
> 14: 24 15
> 15: 574 8
> 16: 516 9

What is queue_busy?

>
> The send latency is actually pretty good, around 150 us on 
> controller 1

150us for what? The average duration of the sendmsg call alone?

i.e.
start
sendmsg
end
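
In code, something like this (the send_lat_ns/send_cnt fields are
hypothetical, just to illustrate what I'm asking):

static int nvme_tcp_timed_sendmsg(struct nvme_tcp_queue *queue,
				  struct msghdr *msg)
{
	u64 start = ktime_get_ns();
	int ret = sock_sendmsg(queue->sock, msg);

	/* hypothetical accounting; average = send_lat_ns / send_cnt */
	queue->send_lat_ns += ktime_get_ns() - start;
	queue->send_cnt++;
	return ret;
}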

> (with two notable exceptions of 480 us on queue 5 and 646 us on queue 
> 6) and around 100 us on controller 2 (again with exceptions of 444 us 
> on queue 5 and 643 us on queue 6). Each queue processing around 150k 
> requests.

Are you referring to the above measurements? I'm having trouble 
understanding what you wrote...

>
> Receive latency is far more consistent; each queue has around 150 us 
> on controller 1 and 140 us on controller 2.

Again, latency of what?


