don't reorder requests passed to ->queue_rqs
Christoph Hellwig
hch at lst.de
Wed Nov 13 07:20:40 PST 2024
Hi Jens,
currently blk-mq reorders requests when adding them to the plug because
the request list can't do efficient tail appends. When the plug is
directly issued using ->queue_rqs that means reordered requests are
passed to the driver, which can lead to very bad I/O patterns when
not corrected, especially on rotational devices (e.g. NVMe HDD) or
when using zone append.
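To illustrate the reordering described above, here is a minimal userspace sketch, not the blk-mq code. It assumes a singly linked list with only a head pointer, where the cheap O(1) insertion is at the head. The names demo_rq, demo_plug_add_head and demo_plug_pop are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; only what the example needs. */
struct demo_rq {
	int tag;		/* submission order, for illustration */
	struct demo_rq *next;
};

/*
 * With only a head pointer, the O(1) insertion is at the head.
 * Each new request therefore ends up in front of the older ones.
 */
void demo_plug_add_head(struct demo_rq **head, struct demo_rq *rq)
{
	assert(rq != NULL);
	rq->next = *head;
	*head = rq;
}

/* Remove and return the first request, or NULL if the list is empty. */
struct demo_rq *demo_plug_pop(struct demo_rq **head)
{
	struct demo_rq *rq = *head;

	if (rq) {
		*head = rq->next;
		rq->next = NULL;
	}
	return rq;
}
```

Adding requests 1, 2, 3 this way and then walking the list from the head yields 3, 2, 1, which is the order a consumer sees if it takes the list as-is.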
This series first adds two easily backportable workarounds that reverse
the reordering in the virtio_blk and nvme-pci ->queue_rqs implementations,
similar to what the non-queue_rqs path does. It then adds an rq_list
type that allows for efficient tail insertions, uses it to fix the
reordering for real, and does the same for I/O completions as well.
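As a rough sketch of what a list type with efficient tail insertion can look like, the userspace example below keeps both a head and a tail pointer so that appends are O(1) and submission order is preserved. This is an illustration under stated assumptions, not the code from the series: sketch_rq, sketch_rq_list, sketch_rq_list_add_tail and sketch_rq_list_pop are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; only what the example needs. */
struct sketch_rq {
	int tag;		/* submission order, for illustration */
	struct sketch_rq *next;
};

/* Singly linked list that also tracks its last element. */
struct sketch_rq_list {
	struct sketch_rq *head;
	struct sketch_rq *tail;
};

/* Append in O(1) by linking behind the current tail. */
void sketch_rq_list_add_tail(struct sketch_rq_list *l, struct sketch_rq *rq)
{
	assert(rq != NULL);
	rq->next = NULL;
	if (l->tail)
		l->tail->next = rq;
	else
		l->head = rq;	/* list was empty */
	l->tail = rq;
}

/* Remove and return the first request, or NULL if the list is empty. */
struct sketch_rq *sketch_rq_list_pop(struct sketch_rq_list *l)
{
	struct sketch_rq *rq = l->head;

	if (rq) {
		l->head = rq->next;
		if (!l->head)
			l->tail = NULL;	/* list became empty */
		rq->next = NULL;
	}
	return rq;
}
```

The extra pointer costs one word per list, and in exchange the consumer sees requests in the order they were added without a separate reversal pass.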
Diffstat:
 block/blk-core.c              |  6 +-
 block/blk-merge.c             |  2
 block/blk-mq.c                | 42 ++++++++---------
 block/blk-mq.h                |  2
 drivers/block/null_blk/main.c |  9 +--
 drivers/block/virtio_blk.c    | 53 ++++++++++------------
 drivers/nvme/host/apple.c     |  2
 drivers/nvme/host/pci.c       | 46 ++++++++-----------
 include/linux/blk-mq.h        | 99 ++++++++++++++++++++----------------------
 include/linux/blkdev.h        | 11 +++-
 io_uring/rw.c                 |  4 -
 11 files changed, 133 insertions(+), 143 deletions(-)