[PATCHv2] nvme-tcp: align I/O cpu with blk-mq mapping

Sagi Grimberg sagi at grimberg.me
Mon Jun 24 23:49:40 PDT 2024



On 24/06/2024 23:01, Kamaljit Singh wrote:
> On 19/06/2024 18:58, Sagi Grimberg wrote:
>>>>> I see how you address multiple controllers falling into the same
>>>>> mappings case in your patch.
>>>>> You could have selected a different mq_map entry for each controller
>>>>> (out of the entries that map to the qid).
>>>>>
>>>> I looked at it, but had no idea how to figure out the load.
>>>> The load is actually per-cpu, but we only have per-controller
>>>> structures.
>>>> So we would need to introduce a per-cpu counter tracking the
>>>> number of queues scheduled on that CPU.
>>>> But that won't help with the CPU oversubscription issue; we still
>>>> might have a substantially higher number of overall queues than we have
>>>> CPUs...
>>> I think that it would still be better than what you have right now:
>>>
>>> IIUC, right now all controllers will end up with (based on your example):
>>> queue 1: using cpu 6
>>> queue 2: using cpu 9
>>> queue 3: using cpu 18
>>>
>>> But selecting a different mq_map entry can give:
>>> ctrl1:
>>> queue 1: using cpu 6
>>> queue 2: using cpu 9
>>> queue 3: using cpu 18
>>>
>>> ctrl2:
>>> queue 1: using cpu 7
>>> queue 2: using cpu 10
>>> queue 3: using cpu 19
>>>
>>> ctrl3:
>>> queue 1: using cpu 8
>>> queue 2: using cpu 11
>>> queue 3: using cpu 20
>>>
>>> ctrl4:
>>> queue 1: using cpu 54
>>> queue 2: using cpu 57
>>> queue 3: using cpu 66
>>>
>>> and so on...
>> Hey Hannes,
>>
>> Did you make progress with this one?
> Sagi,
> Looks like this change should also address the other NVMe/TCP issue that I had
> reported for WQ_UNBOUND, agree?

Well, this takes load off the nvme-tcp io_work, so it makes sense
that it runs for a shorter time now.
However, it's not clear to me that we are not spending these 10ms in
softirq in your test case.

Is it possible to measure it, e.g. by adding debug prints or trace events?
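
As a coarser alternative to debug prints or trace events, per-CPU softirq
time can be sampled from /proc/stat before and after a test run. The sketch
below is only a user-space illustration, not part of the patch: the function
names softirq_ticks and softirq_delta are hypothetical, and it assumes the
usual /proc/stat layout in which softirq is the seventh value on each cpuN
line, counted in USER_HZ ticks.

```python
def softirq_ticks(stat_text):
    """Parse /proc/stat text and return {cpu_number: softirq_ticks}.

    Assumes the usual column order on each "cpuN" line:
    user nice system idle iowait irq softirq ...
    """
    ticks = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Skip non-CPU lines and the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        if len(fields) < 8:
            continue
        # fields[0] is "cpuN"; fields[7] is the softirq column.
        ticks[int(fields[0][3:])] = int(fields[7])
    return ticks


def softirq_delta(before, after):
    """Return per-CPU softirq ticks accumulated between two samples."""
    return {cpu: after[cpu] - before.get(cpu, 0) for cpu in after}
```

To use it, read /proc/stat once before and once after the workload, pass
both strings through softirq_ticks(), and compare the softirq_delta() values
for the CPUs the queues are expected to run on. This only shows where softirq
time accumulates; it cannot attribute it to a specific io_work invocation.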

> If so, would this also be merged into the
> 6.8.xx branch?

I don't know if this will be considered a bug fix. The complaint is
really advisory, and does not affect the stability of the driver afaiu.
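
For reference, the per-controller mq_map selection quoted above could be
modelled in user space roughly as follows. This is a sketch under stated
assumptions, not the patch itself: pick_io_cpu and EXAMPLE_MQ_MAP are
hypothetical names, the map contains only the CPUs listed in the quoted
example, and the controller index is assumed to be a simple running count.

```python
# Hypothetical cpu -> queue map built only from the CPUs named in the
# quoted example; a real blk-mq mq_map covers every possible CPU.
EXAMPLE_MQ_MAP = {
    6: 1, 7: 1, 8: 1, 54: 1,
    9: 2, 10: 2, 11: 2, 57: 2,
    18: 3, 19: 3, 20: 3, 66: 3,
}


def pick_io_cpu(mq_map, qid, ctrl_index):
    """Pick an I/O CPU for queue `qid` of the controller at `ctrl_index`.

    Among the CPUs that map to `qid`, each controller takes the next
    entry in ascending CPU order, wrapping around once they run out.
    Returns None if no CPU maps to the queue.
    """
    cpus = sorted(cpu for cpu, queue in mq_map.items() if queue == qid)
    if not cpus:
        return None
    return cpus[ctrl_index % len(cpus)]
```

With this map, controller index 0 gets CPUs 6, 9 and 18 for queues 1 to 3,
and index 3 gets 54, 57 and 66, matching the quoted listing. A fifth
controller wraps back to CPU 6, which is the oversubscription case raised
earlier in the thread: spreading helps only while there are unused entries.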
