[PATCH RFC 0/2] block,nvme: latency-based I/O scheduler

Hannes Reinecke hare at kernel.org
Tue Mar 26 08:35:27 PDT 2024


Hi all,

There have been several attempts to implement a latency-based I/O
scheduler for native nvme multipath, all of which had their issues.

So time to start afresh, this time using the QoS framework
already present in the block layer.
The patchset consists of two parts:
- a new 'blk-nodelat' QoS module, which is just a simple per-node
  latency tracker
- a 'latency' nvme I/O policy
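
To illustrate the general idea (not the code in these patches), here is
a minimal userspace sketch of a decaying per-node latency average and a
path selector that picks the lowest tracked latency. The struct, the
function names and the averaging formula are assumptions made for this
example only; the real implementation is in blk-nodelat.c and
multipath.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node latency state; not the structure from the patch. */
struct node_lat {
	uint64_t avg_ns;	/* decayed average latency, 0 = no samples yet */
};

/*
 * Fold one completion latency into the decayed average.
 * Assumed formula: avg = (avg * (decay - 1) + sample) / decay,
 * so a larger 'decay' makes the average react more slowly.
 */
static void node_lat_update(struct node_lat *nl, uint64_t sample_ns,
			    unsigned int decay)
{
	assert(decay > 0);
	if (nl->avg_ns == 0)
		nl->avg_ns = sample_ns;	/* first sample seeds the average */
	else
		nl->avg_ns = (nl->avg_ns * (decay - 1) + sample_ns) / decay;
}

/*
 * Return the index of the path with the lowest tracked latency.
 * A path without samples is returned first so that it gets measured.
 */
static size_t pick_lowest_latency(const struct node_lat *paths, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		if (paths[i].avg_ns == 0)
			return i;
		if (paths[i].avg_ns < paths[best].avg_ns)
			best = i;
	}
	return best;
}
```

With 'decay' set to 10, a single slow completion moves the average by
only a tenth of the difference, so a path is not abandoned because of
one outlier.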

Using the 'tiobench' fio script I'm getting:
  WRITE: bw=531MiB/s (556MB/s), 33.2MiB/s-52.4MiB/s
         (34.8MB/s-54.9MB/s), io=4096MiB (4295MB), run=4888-7718msec
  WRITE: bw=539MiB/s (566MB/s), 33.7MiB/s-50.9MiB/s
         (35.3MB/s-53.3MB/s), io=4096MiB (4295MB), run=5033-7594msec
   READ: bw=898MiB/s (942MB/s), 56.1MiB/s-75.4MiB/s
         (58.9MB/s-79.0MB/s), io=4096MiB (4295MB), run=3397-4560msec
   READ: bw=1023MiB/s (1072MB/s), 63.9MiB/s-75.1MiB/s
         (67.0MB/s-78.8MB/s), io=4096MiB (4295MB), run=3408-4005msec

for 'round-robin' and

  WRITE: bw=574MiB/s (601MB/s), 35.8MiB/s-45.5MiB/s
         (37.6MB/s-47.7MB/s), io=4096MiB (4295MB), run=5629-7142msec
  WRITE: bw=639MiB/s (670MB/s), 39.9MiB/s-47.5MiB/s
         (41.9MB/s-49.8MB/s), io=4096MiB (4295MB), run=5388-6408msec
   READ: bw=1024MiB/s (1074MB/s), 64.0MiB/s-73.7MiB/s
         (67.1MB/s-77.2MB/s), io=4096MiB (4295MB), run=3475-4000msec
   READ: bw=1013MiB/s (1063MB/s), 63.3MiB/s-72.6MiB/s
         (66.4MB/s-76.2MB/s), io=4096MiB (4295MB), run=3524-4042msec
  
for 'latency' with 'decay' set to 10.
That's on a 32G FC testbed running against a brd target,
with fio running 16 threads.

As usual, comments and reviews are welcome.

Hannes Reinecke (2):
  block: track per-node I/O latency
  nvme: add 'latency' iopolicy

 block/Kconfig                 |   7 +
 block/Makefile                |   1 +
 block/blk-mq-debugfs.c        |   2 +
 block/blk-nodelat.c           | 368 ++++++++++++++++++++++++++++++++++
 block/blk-rq-qos.h            |   6 +
 drivers/nvme/host/multipath.c |  46 ++++-
 drivers/nvme/host/nvme.h      |   2 +
 include/linux/blk-mq.h        |  11 +
 8 files changed, 439 insertions(+), 4 deletions(-)
 create mode 100644 block/blk-nodelat.c

-- 
2.35.3



