[PATCH 3/3] nvme-tcp: per-controller I/O workqueues
Sagi Grimberg
sagi at grimberg.me
Mon Jul 8 07:41:37 PDT 2024
On 08/07/2024 15:48, Hannes Reinecke wrote:
> On 7/8/24 14:12, Sagi Grimberg wrote:
>>
>>
>> On 08/07/2024 10:10, Hannes Reinecke wrote:
>>> From: Hannes Reinecke <hare at suse.de>
>>>
>>> Implement per-controller I/O workqueues to reduce workqueue contention
>>> during I/O and improve I/O performance.
>>>
>>> Performance comparison:
>>>                 baseline   rx/tx      blk-mq     multiple workqueues
>>> 4k seq write:   449MiB/s   480MiB/s   524MiB/s   540MiB/s
>>> 4k rand write:  410MiB/s   481MiB/s   524MiB/s   539MiB/s
>>> 4k seq read:    478MiB/s   481MiB/s   566MiB/s   582MiB/s
>>> 4k rand read:   547MiB/s   480MiB/s   511MiB/s   633MiB/s
>>
>> I am still puzzled by this one.
>>
>> This is for 2 controllers? or more?
>> It is interesting that the rand read sees a higher boost than the seq read.
>> Is this the nature of the SSD? What happens with null_blk?
>>
> See the patchset description.
> Two controllers, one subsystem.
Can we try with null_blk as well?
Not that I'm dismissing the nvme numbers.
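For reference, a minimal sketch of what a per-controller workqueue could look
like, written against the driver's internal types and in the style of the
existing global nvme_tcp_wq allocation; the io_wq field and the helper names
are illustrative only, not the actual patch:

static int nvme_tcp_alloc_io_wq(struct nvme_tcp_ctrl *ctrl)
{
	/* one I/O workqueue per controller, named after the controller instance */
	ctrl->io_wq = alloc_workqueue("nvme_tcp_wq_%d",
				      WQ_MEM_RECLAIM | WQ_HIGHPRI, 0,
				      ctrl->ctrl.instance);
	if (!ctrl->io_wq)
		return -ENOMEM;
	return 0;
}

/* queueing then targets the controller's workqueue instead of the global one */
static void nvme_tcp_queue_io_work(struct nvme_tcp_queue *queue)
{
	queue_work_on(queue->io_cpu, queue->ctrl->io_wq, &queue->io_work);
}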
>
>> CCing Tejun. Is it possible that using two different workqueues
>> for a symmetrical workload is better than a single global workqueue?
>
> Yes, that is the implication.
> It might simply be due to the fact that the number of active work items
> per workqueue is limited to 512:
>
> workqueue.h:    WQ_MAX_ACTIVE = 512,    /* I like 512, better ideas? */
>
> so by using a workqueue per controller we effectively raise the number
> of work items that can be active at once. And we're reducing the
> contention between controllers, as each controller can now schedule
> I/O independently of the other.
>
> Let me check what would happen if I increase MAX_ACTIVE ...
Umm, do you actually have more than 512 work items active at once? You have
64 queues in your setup, no?
Maybe I'm missing something...
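FWIW, if the suspicion really is the per-workqueue limit, the knob is the
max_active argument of alloc_workqueue() rather than workqueue.h itself. A
minimal sketch, assuming I read workqueue.h right (0 selects the default of
WQ_DFL_ACTIVE == 256, and WQ_MAX_ACTIVE == 512 is the cap); the flags here
only mirror the current nvme_tcp_wq allocation and are not a recommendation:

#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *nvme_tcp_wq;

static int __init nvme_tcp_wq_example_init(void)
{
	/* explicit max_active instead of the 0/default used today */
	nvme_tcp_wq = alloc_workqueue("nvme_tcp_wq",
				      WQ_MEM_RECLAIM | WQ_HIGHPRI,
				      WQ_MAX_ACTIVE);
	return nvme_tcp_wq ? 0 : -ENOMEM;
}

static void __exit nvme_tcp_wq_example_exit(void)
{
	destroy_workqueue(nvme_tcp_wq);
}

module_init(nvme_tcp_wq_example_init);
module_exit(nvme_tcp_wq_example_exit);
MODULE_LICENSE("GPL");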