[PATCH v5 0/6] io_uring passthrough for nvme

Kanchan Joshi joshi.k at samsung.com
Tue May 10 22:47:44 PDT 2022


This series is against the "for-5.19/io_uring-passthrough" branch (linux-block).
Patches to be refreshed on top of 2bb04df7c ("io_uring: support CQE32").

uring-cmd is a facility that enables io_uring capabilities (async being
one of them) for any arbitrary command (ioctl, fsctl or whatever else)
exposed by command providers (driver, fs etc.). The series introduces
uring-cmd and connects nvme passthrough (over the generic device
/dev/ngXnY) to it.

uring-cmd is specified by IORING_OP_URING_CMD. The storage for the
command is provided in the SQE itself. On a regular ring, 16 bytes of
space are available, which can be accessed using "sqe->cmd".
Alternatively, an application can set up the ring with the flag
IORING_SETUP_SQE128. In that case, each SQE of the ring is 128 bytes in
size and provides 80 bytes of storage space for placing the command.
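
The size arithmetic above can be checked with a small sketch. The struct
below is a hypothetical mirror of the SQE layout, not the kernel
definition: the fixed fields are collapsed into an opaque 48-byte header
(an assumption derived from a 64-byte regular SQE with 16 bytes of
command space), and the names sketch_sqe and cmd_space are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a regular SQE. The 48-byte header stands in for
 * the fixed fields (opcode, flags, fd, user_data, ...); it is an
 * assumption of this sketch, not the kernel's struct io_uring_sqe. */
struct sketch_sqe {
	uint8_t fixed[48];	/* opaque fixed fields */
	uint8_t cmd[16];	/* inline command storage ("sqe->cmd") */
};

static_assert(sizeof(struct sketch_sqe) == 64,
	      "a regular SQE is assumed to be 64 bytes");

/* Bytes available for the command, given the per-SQE size of the ring. */
static size_t cmd_space(size_t sqe_size)
{
	return sqe_size - offsetof(struct sketch_sqe, cmd);
}
```

With this layout, cmd_space(64) gives the 16 bytes of a regular ring and
cmd_space(128) gives the 80 bytes of a ring set up with
IORING_SETUP_SQE128.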

nvme io-passthrough is specified by the new operation NVME_URING_CMD_IO.
It operates on a new structure, nvme_uring_cmd, which is 72 bytes in
size. This does not fit in the 16 bytes of a regular SQE, so nvme
requires the ring to be set up with IORING_SETUP_SQE128.
nvme passthrough also requires two results to be returned to user-space.
Therefore the ring needs to be set up with the flag IORING_SETUP_CQE32.
When this flag is specified, each CQE of the ring is 32 bytes in size.
The uring-cmd infrastructure exports helpers so that the additional
result is collected from the provider and placed into the CQE.
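
A rough sketch of why nvme needs both the big SQE and the big CQE. Only
the sizes (72-byte command, 16 or 80 bytes of SQE command space, 32-byte
CQE) come from the text above. The struct names, the field names and the
cmd_fits() helper are assumptions of this sketch and are not taken from
the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative 72-byte passthrough command. The field names are
 * assumptions; only the total size is taken from the cover letter. */
struct sketch_nvme_uring_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t rsvd1;
	uint32_t nsid;
	uint32_t cdw2;
	uint32_t cdw3;
	uint64_t metadata;
	uint64_t addr;
	uint32_t metadata_len;
	uint32_t data_len;
	uint32_t cdw10_15[6];
	uint32_t timeout_ms;
	uint32_t rsvd2;
};
static_assert(sizeof(struct sketch_nvme_uring_cmd) == 72,
	      "command is expected to be 72 bytes");

/* Illustrative 32-byte CQE: the usual 16 bytes plus 16 extra bytes that
 * leave room for an additional result. */
struct sketch_cqe32 {
	uint64_t user_data;
	int32_t  res;		/* regular completion result */
	uint32_t flags;
	uint64_t extra[2];	/* extra space on a CQE32 ring */
};
static_assert(sizeof(struct sketch_cqe32) == 32,
	      "big CQE is expected to be 32 bytes");

enum { CMD_SPACE_REGULAR = 16, CMD_SPACE_SQE128 = 80 };

/* Returns 1 if a command of cmd_len bytes fits inline in the SQE. */
static int cmd_fits(size_t cmd_len, size_t cmd_space)
{
	return cmd_len <= cmd_space;
}
```

Under these assumptions, cmd_fits(sizeof(struct sketch_nvme_uring_cmd),
CMD_SPACE_REGULAR) is false while the same call with CMD_SPACE_SQE128 is
true, which matches the requirement for a 128-byte SQE.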

Testing is done using this custom fio branch:
https://github.com/joshkan/fio/tree/big-cqe-pt.v4
Regular io_uring I/O (read/write) is turned into passthrough I/O by
supplying the "-uring_cmd=1" option.

Example command line:
fio -iodepth=1 -rw=randread -ioengine=io_uring -bs=4k -numjobs=1 -size=4k -group_reporting -filename=/dev/ng0n1 -name=io_uring_1 -uring_cmd=1

Changes since v4:
https://lore.kernel.org/linux-nvme/20220505060616.803816-1-joshi.k@samsung.com/
- Allow uring-cmd to operate on regular ring too
- Move big-sqe/big-cqe requirement to nvme
- Add support for cases when uring-cmd needs deferral
- Redo Patch 3
- In nvme, use READ_ONCE while reading cmd fields from SQE
- Refactoring in Patch 4 based on the feedback of Christoph

Changes since v3:
https://lore.kernel.org/linux-nvme/20220503184831.78705-1-p.raghav@samsung.com/
- Cleaned up placements of new fields in sqe and io_uring_cmd
- Removed packed-attribute from nvme_uring_cmd struct
- Applied all of Christoph's other feedback too
- Applied Jens' feedback
- Collected reviewed-by

Changes since v2:
https://lore.kernel.org/linux-nvme/20220401110310.611869-1-joshi.k@samsung.com/
- Rewire uring-cmd infrastructure on top of new big CQE
- Prep patch (refactored) and feedback from Christoph
- Add new opcode and structure in nvme for uring-cmd
- Enable vectored-io

Changes since v1:
https://lore.kernel.org/linux-nvme/20220308152105.309618-1-joshi.k@samsung.com/
- Trim down by removing patches for polling, fixed-buffer and bio-cache
- Add big CQE and move uring-cmd to use that
- Remove indirect (pointer) submission

Anuj Gupta (1):
  nvme: add vectored-io support for uring-cmd

Christoph Hellwig (1):
  nvme: refactor nvme_submit_user_cmd()

Jens Axboe (3):
  fs,io_uring: add infrastructure for uring-cmd
  block: wire-up support for passthrough plugging
  io_uring: finish IOPOLL/ioprio prep handler removal

Kanchan Joshi (1):
  nvme: wire-up uring-cmd support for io-passthru on char-device.

 block/blk-mq.c                  |  73 +++++-----
 drivers/nvme/host/core.c        |   1 +
 drivers/nvme/host/ioctl.c       | 247 ++++++++++++++++++++++++++++++--
 drivers/nvme/host/multipath.c   |   1 +
 drivers/nvme/host/nvme.h        |   4 +
 fs/io_uring.c                   | 135 ++++++++++++++---
 include/linux/fs.h              |   2 +
 include/linux/io_uring.h        |  33 +++++
 include/uapi/linux/io_uring.h   |  21 +--
 include/uapi/linux/nvme_ioctl.h |  26 ++++
 10 files changed, 471 insertions(+), 72 deletions(-)

-- 
2.25.1



