[PATCHv2 0/7] dma mapping optimisations

Keith Busch kbusch at fb.com
Tue Aug 2 12:36:26 PDT 2022


From: Keith Busch <kbusch at kernel.org>

Changes since v1:

  Mapping/unmapping goes through file ops instead of block device ops.
  This abstracts io_uring from knowing about specific block devices. For
  this series, the only "file system" implementing the ops is the raw
  block device, but it should be easy enough to add more filesystems if
  needed.

  Registering a mapping requires io_uring fixed files. This bounds the
  registered buffer's lifetime by that of the file it was registered
  with.

Summary:

A user address for a read or write to a block device typically passes
through several representations on every IO. Each one consumes memory
and CPU cycles. When the backing storage is NVMe, the sequence looks
something like the following:

  __user void *
  struct iov_iter
  struct page *[]
  struct bio_vec[]
  struct scatterlist[]
  __le64[]

Applications often reuse the same buffer for many IOs, though, so these
potentially costly per-IO transformations, which arrive at the exact
same hardware descriptor each time, can be skipped.

The io_uring interface already provides a way for users to register
buffers, which gets as far as the 'struct bio_vec[]'. That still leaves
building the scatterlist for the repeated dma_map_sg() calls, followed
by the transformation to NVMe's PRP list format.
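
To illustrate the last step of that chain, here is a minimal userspace
sketch of turning one DMA-contiguous range into PRP entries. It is not
the nvme-pci code: the function name, the fixed 4 KiB memory page size,
and the flat output array are all assumptions made for illustration. It
also ignores the PRP1/PRP2 command fields, list chaining, and the
little-endian conversion. It only shows the rule that the first entry
may carry an offset into its page while every later entry is page
aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB controller memory page size. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: write one entry per memory page touched by the
 * range [dma_addr, dma_addr + len). Only the first entry may point into
 * the middle of a page. Returns the number of entries written, or -1 if
 * max_entries is too small.
 */
static int sketch_build_prps(uint64_t dma_addr, size_t len,
                             uint64_t *prps, int max_entries)
{
    int n = 0;

    assert(prps != NULL);

    while (len > 0) {
        size_t offset = (size_t)(dma_addr & (SKETCH_PAGE_SIZE - 1));
        size_t chunk = SKETCH_PAGE_SIZE - offset;

        if (chunk > len)
            chunk = len;
        if (n == max_entries)
            return -1;

        prps[n++] = dma_addr;   /* aligned for every entry after the first */
        dma_addr += chunk;
        len -= chunk;
    }
    return n;
}
```

This work is the same on every IO that reuses the buffer, which is why
doing it once at registration time is attractive.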

This series takes the registered buffers a step further. A block driver
can implement a new .dma_map() callback to complete the representation
down to the hardware's DMA mapped address, and return a cookie so a
user can reference it later for any given IO. When used, the block
stack can skip significant amounts of code, improving CPU utilization
and, if not bandwidth limited, IOPS.
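
The map-once, reference-by-cookie idea can be modelled in plain
userspace C as below. This is only a sketch: every name in it
(sketch_seg, sketch_dma_tag, sketch_dma_map and so on) is hypothetical
and does not reflect the callback signatures in the patches, and the
"translation" is a simple copy standing in for the real DMA mapping
work. A counter shows that the costly step runs once no matter how many
IOs reference the cookie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one registered buffer segment. */
struct sketch_seg {
    uint64_t addr;  /* pretend bus address of the segment */
    size_t len;
};

/* Hypothetical cookie: the device-ready descriptor, built once. */
struct sketch_dma_tag {
    uint64_t *desc; /* one device-visible address per segment */
    int nr;
};

/* Counts how often the costly translation path runs. */
static int sketch_translations;

/* Hypothetical map step: translate once and hand back a cookie. */
static struct sketch_dma_tag *sketch_dma_map(const struct sketch_seg *segs,
                                             int nr)
{
    struct sketch_dma_tag *tag;

    if (nr <= 0)
        return NULL;
    tag = malloc(sizeof(*tag));
    if (!tag)
        return NULL;
    tag->desc = malloc(sizeof(*tag->desc) * (size_t)nr);
    if (!tag->desc) {
        free(tag);
        return NULL;
    }
    for (int i = 0; i < nr; i++)
        tag->desc[i] = segs[i].addr;    /* stands in for the real mapping */
    tag->nr = nr;
    sketch_translations++;
    return tag;
}

/* Per-IO path: look up the prebuilt entry instead of translating again. */
static uint64_t sketch_submit(const struct sketch_dma_tag *tag, int seg)
{
    assert(seg >= 0 && seg < tag->nr);
    return tag->desc[seg];
}

/* Hypothetical unmap step: release the cookie when the buffer goes away. */
static void sketch_dma_unmap(struct sketch_dma_tag *tag)
{
    free(tag->desc);
    free(tag);
}
```

A caller would invoke sketch_dma_map() once at registration, pass the
returned tag to sketch_submit() for each IO, and call
sketch_dma_unmap() when the registration is dropped.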

The implementation is currently limited to mapping a registered buffer
to a single file.

Keith Busch (7):
  blk-mq: add ops to dma map bvec
  file: add ops to dma map bvec
  iov_iter: introduce type for preregistered dma tags
  block: add dma tag bio type
  io_uring: introduce file slot release helper
  io_uring: add support for dma pre-mapping
  nvme-pci: implement dma_map support

 block/bdev.c                   |  20 +++
 block/bio.c                    |  25 ++-
 block/blk-merge.c              |  19 +++
 block/fops.c                   |  20 +++
 drivers/nvme/host/pci.c        | 302 +++++++++++++++++++++++++++++++--
 fs/file.c                      |  15 ++
 include/linux/bio.h            |  21 ++-
 include/linux/blk-mq.h         |  24 +++
 include/linux/blk_types.h      |   6 +-
 include/linux/blkdev.h         |  16 ++
 include/linux/fs.h             |  20 +++
 include/linux/io_uring_types.h |   2 +
 include/linux/uio.h            |   9 +
 include/uapi/linux/io_uring.h  |  12 ++
 io_uring/filetable.c           |  34 ++--
 io_uring/filetable.h           |  10 +-
 io_uring/io_uring.c            | 137 +++++++++++++++
 io_uring/net.c                 |   2 +-
 io_uring/rsrc.c                |  26 +--
 io_uring/rsrc.h                |  10 +-
 io_uring/rw.c                  |   2 +-
 lib/iov_iter.c                 |  24 ++-
 22 files changed, 704 insertions(+), 52 deletions(-)

-- 
2.30.2



