[PATCHv2 0/3] nvme-tcp: improve scalability

Mon Jul 8 00:10:10 PDT 2024

Hi all,

for workloads with a lot of controllers we run into workqueue contention,
where the single workqueue is not able to service requests fast enough,
leading to spurious I/O errors and connection resets under high load.
This patchset improves the situation by improving the fairness between
rx and tx scheduling, introducing per-controller workqueues,
and distributing the load according to the blk-mq cpu mapping.
With this we reduce the spurious I/O errors and improve the overall
performance for highly contended workloads.
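
To illustrate the idea behind aligning the I/O cpu with the blk-mq
mapping, here is a minimal userspace sketch. It is not the code from
the patches: pick_io_cpu(), the mq_map[] array and the cpu_load[]
counters are hypothetical names made up for this example, and picking
the least-loaded cpu among the candidates is an assumption about one
reasonable policy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a blk-mq cpu map: mq_map[cpu] holds the
 * index of the hardware queue that submissions from 'cpu' are routed to.
 * cpu_load[cpu] is an assumed per-cpu counter of queues already
 * assigned to that cpu.
 *
 * Returns the least-loaded cpu that maps to 'qid' (lowest cpu number
 * wins a tie), or -1 if no cpu maps to that queue.
 */
static int pick_io_cpu(const unsigned int *mq_map,
		       const unsigned int *cpu_load,
		       size_t nr_cpus, unsigned int qid)
{
	int best = -1;
	size_t cpu;

	assert(mq_map && cpu_load);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/* Only consider cpus that submit to this queue. */
		if (mq_map[cpu] != qid)
			continue;
		if (best < 0 || cpu_load[cpu] < cpu_load[best])
			best = (int)cpu;
	}
	return best;
}
```

With such a helper the I/O work for a queue would run on a cpu that
also submits to that queue, rather than on a cpu derived from the
queue number alone.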

All performance numbers are derived from the 'tiobench-example.fio'
sample from the fio sources, running on a 96-core machine with one
subsystem and two paths, each path exposing 32 queues.
Backend is nvmet using an Intel DC P3700 NVMe SSD.

Changes to the initial submission:
- Make the changes independent from the 'wq_unbound' parameter
- Drop changes to the workqueue
- Add patch to improve rx/tx fairness

Hannes Reinecke (3):
  nvme-tcp: improve rx/tx fairness
  nvme-tcp: align I/O cpu with blk-mq mapping
  nvme-tcp: per-controller I/O workqueues

 drivers/nvme/host/tcp.c | 135 ++++++++++++++++++++++++++++------------
 1 file changed, 95 insertions(+), 40 deletions(-)

-- 
2.35.3