[PATCH for-next 2/4] io_uring: introduce fixed buffer support for io_uring_cmd

Kanchan Joshi joshi.k at samsung.com
Mon Aug 22 04:33:41 PDT 2022


On Mon, Aug 22, 2022 at 11:58:24AM +0100, Pavel Begunkov wrote:
>On 8/19/22 11:30, Kanchan Joshi wrote:
>>From: Anuj Gupta <anuj20.g at samsung.com>
>>
>>Add IORING_OP_URING_CMD_FIXED opcode that enables sending io_uring
>>command with previously registered buffers. User-space passes the buffer
>>index in sqe->buf_index, same as done in read/write variants that uses
>>fixed buffers.
>>
>>Signed-off-by: Anuj Gupta <anuj20.g at samsung.com>
>>Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>>---
>>  include/linux/io_uring.h      |  5 ++++-
>>  include/uapi/linux/io_uring.h |  1 +
>>  io_uring/opdef.c              | 10 ++++++++++
>>  io_uring/rw.c                 |  3 ++-
>>  io_uring/uring_cmd.c          | 18 +++++++++++++++++-
>>  5 files changed, 34 insertions(+), 3 deletions(-)
>>
>>diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
>>index 60aba10468fc..40961d7c3827 100644
>>--- a/include/linux/io_uring.h
>>+++ b/include/linux/io_uring.h
>>@@ -5,6 +5,8 @@
>>  #include <linux/sched.h>
>>  #include <linux/xarray.h>
>>+#include<uapi/linux/io_uring.h>
>>+
>>  enum io_uring_cmd_flags {
>>  	IO_URING_F_COMPLETE_DEFER	= 1,
>>  	IO_URING_F_UNLOCKED		= 2,
>>@@ -15,6 +17,7 @@ enum io_uring_cmd_flags {
>>  	IO_URING_F_SQE128		= 4,
>>  	IO_URING_F_CQE32		= 8,
>>  	IO_URING_F_IOPOLL		= 16,
>>+	IO_URING_F_FIXEDBUFS		= 32,
>>  };
>>  struct io_uring_cmd {
>>@@ -33,7 +36,7 @@ struct io_uring_cmd {
>>  #if defined(CONFIG_IO_URING)
>>  int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
>>-		struct iov_iter *iter, void *ioucmd)
>>+		struct iov_iter *iter, void *ioucmd);
>
>Please try to compile the first patch separately

Indeed, this should have been part of that patch. Thanks.

>>  void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
>>  void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
>>  			void (*task_work_cb)(struct io_uring_cmd *));
>>diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
>>index 1463cfecb56b..80ea35d1ed5c 100644
>>--- a/include/uapi/linux/io_uring.h
>>+++ b/include/uapi/linux/io_uring.h
>>@@ -203,6 +203,7 @@ enum io_uring_op {
>>  	IORING_OP_SOCKET,
>>  	IORING_OP_URING_CMD,
>>  	IORING_OP_SENDZC_NOTIF,
>>+	IORING_OP_URING_CMD_FIXED,
>
>I don't think it should be another opcode, is there any
>control flags we can fit it in?

Using sqe->rw_flags could be another way, but I think that may create a
bit of disharmony in user-space.
The current choice (IORING_OP_URING_CMD_FIXED) is along the same lines as
IORING_OP_READ_FIXED/IORING_OP_WRITE_FIXED: user-space uses the new opcode
and selects the registered buffer by filling sqe->buf_index.
So must we take a different way?
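
To make the user-space side concrete, here is a rough sketch of the
difference between the two submissions. It does not use the real uapi
header: struct demo_sqe, the DEMO_OP_* values and the demo_prep_*()
helpers are hypothetical stand-ins, and only the idea (new opcode plus
sqe->buf_index, as with the READ/WRITE_FIXED variants) comes from this
patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical opcode values, for illustration only. The real ones come
 * from include/uapi/linux/io_uring.h with this series applied.
 */
enum demo_op {
	DEMO_OP_URING_CMD	= 1,
	DEMO_OP_URING_CMD_FIXED	= 2,
};

/* Hypothetical cut-down SQE carrying only the fields relevant here. */
struct demo_sqe {
	uint8_t		opcode;
	int32_t		fd;
	uint32_t	cmd_op;
	uint16_t	buf_index;	/* index of a previously registered buffer */
};

/* Plain uring_cmd: no registered buffer is referenced. */
static void demo_prep_uring_cmd(struct demo_sqe *sqe, int fd, uint32_t cmd_op)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = DEMO_OP_URING_CMD;
	sqe->fd = fd;
	sqe->cmd_op = cmd_op;
}

/*
 * Fixed-buffer variant: same preparation, but with the new opcode and
 * the registered buffer selected through buf_index.
 */
static void demo_prep_uring_cmd_fixed(struct demo_sqe *sqe, int fd,
				      uint32_t cmd_op, unsigned int buf_index)
{
	assert(buf_index <= UINT16_MAX);	/* buf_index is 16 bits wide */
	demo_prep_uring_cmd(sqe, fd, cmd_op);
	sqe->opcode = DEMO_OP_URING_CMD_FIXED;
	sqe->buf_index = (uint16_t)buf_index;
}
```

With a flag-based scheme the opcode would stay the same and only a flag
bit would differ, so the application-visible change is about the same
size either way.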

>>  	/* this goes last, obviously */
>>  	IORING_OP_LAST,
>>diff --git a/io_uring/opdef.c b/io_uring/opdef.c
>>index 9a0df19306fe..7d5731b84c92 100644
>>--- a/io_uring/opdef.c
>>+++ b/io_uring/opdef.c
>>@@ -472,6 +472,16 @@ const struct io_op_def io_op_defs[] = {
>>  		.issue			= io_uring_cmd,
>>  		.prep_async		= io_uring_cmd_prep_async,
>>  	},
>>+	[IORING_OP_URING_CMD_FIXED] = {
>>+		.needs_file		= 1,
>>+		.plug			= 1,
>>+		.name			= "URING_CMD_FIXED",
>>+		.iopoll			= 1,
>>+		.async_size		= uring_cmd_pdu_size(1),
>>+		.prep			= io_uring_cmd_prep,
>>+		.issue			= io_uring_cmd,
>>+		.prep_async		= io_uring_cmd_prep_async,
>>+	},
>>  	[IORING_OP_SENDZC_NOTIF] = {
>>  		.name			= "SENDZC_NOTIF",
>>  		.needs_file		= 1,
>>diff --git a/io_uring/rw.c b/io_uring/rw.c
>>index 1a4fb8a44b9a..3c7b94bffa62 100644
>>--- a/io_uring/rw.c
>>+++ b/io_uring/rw.c
>>@@ -1005,7 +1005,8 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
>>  		if (READ_ONCE(req->iopoll_completed))
>>  			break;
>>-		if (req->opcode == IORING_OP_URING_CMD) {
>>+		if (req->opcode == IORING_OP_URING_CMD ||
>>+				req->opcode == IORING_OP_URING_CMD_FIXED) {
>
>I don't see the changed chunk upstream

Right, it is on top of iopoll support (plus one more series mentioned in
the cover letter). Here is the link:
https://lore.kernel.org/linux-block/20220807183607.352351-1-joshi.k@samsung.com/
It would be great if you could review that.

