[PATCH for-next 2/4] io_uring: introduce fixed buffer support for io_uring_cmd
Kanchan Joshi
joshi.k at samsung.com
Mon Aug 22 04:33:41 PDT 2022
On Mon, Aug 22, 2022 at 11:58:24AM +0100, Pavel Begunkov wrote:
>On 8/19/22 11:30, Kanchan Joshi wrote:
>>From: Anuj Gupta <anuj20.g at samsung.com>
>>
>>Add IORING_OP_URING_CMD_FIXED opcode that enables sending io_uring
>>command with previously registered buffers. User-space passes the buffer
>>index in sqe->buf_index, same as done in read/write variants that use
>>fixed buffers.
>>
>>Signed-off-by: Anuj Gupta <anuj20.g at samsung.com>
>>Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>>---
>> include/linux/io_uring.h | 5 ++++-
>> include/uapi/linux/io_uring.h | 1 +
>> io_uring/opdef.c | 10 ++++++++++
>> io_uring/rw.c | 3 ++-
>> io_uring/uring_cmd.c | 18 +++++++++++++++++-
>> 5 files changed, 34 insertions(+), 3 deletions(-)
>>
>>diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
>>index 60aba10468fc..40961d7c3827 100644
>>--- a/include/linux/io_uring.h
>>+++ b/include/linux/io_uring.h
>>@@ -5,6 +5,8 @@
>> #include <linux/sched.h>
>> #include <linux/xarray.h>
>>+#include<uapi/linux/io_uring.h>
>>+
>> enum io_uring_cmd_flags {
>> IO_URING_F_COMPLETE_DEFER = 1,
>> IO_URING_F_UNLOCKED = 2,
>>@@ -15,6 +17,7 @@ enum io_uring_cmd_flags {
>> IO_URING_F_SQE128 = 4,
>> IO_URING_F_CQE32 = 8,
>> IO_URING_F_IOPOLL = 16,
>>+ IO_URING_F_FIXEDBUFS = 32,
>> };
>> struct io_uring_cmd {
>>@@ -33,7 +36,7 @@ struct io_uring_cmd {
>> #if defined(CONFIG_IO_URING)
>> int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
>>- struct iov_iter *iter, void *ioucmd)
>>+ struct iov_iter *iter, void *ioucmd);
>
>Please try to compile the first patch separately
Indeed, this should have been part of that patch. Thanks.
>> void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2);
>> void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
>> void (*task_work_cb)(struct io_uring_cmd *));
>>diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
>>index 1463cfecb56b..80ea35d1ed5c 100644
>>--- a/include/uapi/linux/io_uring.h
>>+++ b/include/uapi/linux/io_uring.h
>>@@ -203,6 +203,7 @@ enum io_uring_op {
>> IORING_OP_SOCKET,
>> IORING_OP_URING_CMD,
>> IORING_OP_SENDZC_NOTIF,
>>+ IORING_OP_URING_CMD_FIXED,
>
>I don't think it should be another opcode. Are there any
>control flags we can fit it in?
Using sqe->rw_flags could be another way, but I think that may create
a bit of disharmony in user-space.
The current choice (IORING_OP_URING_CMD_FIXED) is along the same lines as
IORING_OP_READ_FIXED/IORING_OP_WRITE_FIXED: user-space uses the new opcode
and passes the registered buffer's index in sqe->buf_index.
So must we take a different way?
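
To make the comparison concrete, here is a rough user-space sketch of the
two options. It is only an illustration: struct demo_sqe, the DEMO_* values
and both helper functions are hypothetical stand-ins, not the real uapi
definitions, and the flag bit for the rw_flags variant is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the few SQE fields discussed in this thread.
 * The real struct io_uring_sqe (include/uapi/linux/io_uring.h) has more
 * fields and a different layout.
 */
struct demo_sqe {
	uint8_t  opcode;
	int32_t  fd;
	uint32_t rw_flags;   /* option B would carry the "fixed" bit here */
	uint16_t buf_index;  /* index of a previously registered buffer */
};

/* Illustrative values only, NOT the kernel's opcode numbering. */
enum {
	DEMO_OP_URING_CMD       = 1,
	DEMO_OP_URING_CMD_FIXED = 2,
};

/* Assumed flag bit for option B; nothing in the patch defines it. */
#define DEMO_CMD_F_FIXED (1U << 0)

/* Option A (this patch): dedicated opcode, mirroring READ/WRITE_FIXED. */
static void demo_prep_cmd_fixed_opcode(struct demo_sqe *sqe, int fd,
				       uint16_t buf_index)
{
	assert(sqe != NULL);
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = DEMO_OP_URING_CMD_FIXED;
	sqe->fd = fd;
	sqe->buf_index = buf_index;
}

/* Option B (suggested alternative): same opcode plus a control flag. */
static void demo_prep_cmd_fixed_flag(struct demo_sqe *sqe, int fd,
				     uint16_t buf_index)
{
	assert(sqe != NULL);
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = DEMO_OP_URING_CMD;
	sqe->fd = fd;
	sqe->rw_flags = DEMO_CMD_F_FIXED;
	sqe->buf_index = buf_index;
}
```

In both variants the buffer index travels in buf_index; the only difference
is whether the "fixed" intent is encoded in the opcode or in a flag bit.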
>> /* this goes last, obviously */
>> IORING_OP_LAST,
>>diff --git a/io_uring/opdef.c b/io_uring/opdef.c
>>index 9a0df19306fe..7d5731b84c92 100644
>>--- a/io_uring/opdef.c
>>+++ b/io_uring/opdef.c
>>@@ -472,6 +472,16 @@ const struct io_op_def io_op_defs[] = {
>> .issue = io_uring_cmd,
>> .prep_async = io_uring_cmd_prep_async,
>> },
>>+ [IORING_OP_URING_CMD_FIXED] = {
>>+ .needs_file = 1,
>>+ .plug = 1,
>>+ .name = "URING_CMD_FIXED",
>>+ .iopoll = 1,
>>+ .async_size = uring_cmd_pdu_size(1),
>>+ .prep = io_uring_cmd_prep,
>>+ .issue = io_uring_cmd,
>>+ .prep_async = io_uring_cmd_prep_async,
>>+ },
>> [IORING_OP_SENDZC_NOTIF] = {
>> .name = "SENDZC_NOTIF",
>> .needs_file = 1,
>>diff --git a/io_uring/rw.c b/io_uring/rw.c
>>index 1a4fb8a44b9a..3c7b94bffa62 100644
>>--- a/io_uring/rw.c
>>+++ b/io_uring/rw.c
>>@@ -1005,7 +1005,8 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
>> if (READ_ONCE(req->iopoll_completed))
>> break;
>>- if (req->opcode == IORING_OP_URING_CMD) {
>>+ if (req->opcode == IORING_OP_URING_CMD ||
>>+ req->opcode == IORING_OP_URING_CMD_FIXED) {
>
>I don't see the changed chunk upstream
Right, this is on top of the iopoll support (plus one more series mentioned
in the cover letter). Here is the link:
https://lore.kernel.org/linux-block/20220807183607.352351-1-joshi.k@samsung.com/
It would be great if you could review that.