[RFC 3/5] io_uring: add infra and support for IORING_OP_URING_CMD

Christoph Hellwig hch at lst.de
Mon Apr 4 22:58:35 PDT 2022


On Mon, Apr 04, 2022 at 09:20:00AM +0100, Pavel Begunkov wrote:
>> I'm still not a fan of the double indirect call here.  I don't really
>> have a good idea yet, but I plan to look into it.
>
> I haven't familiarised myself with the series properly, but if it's about
> driver_cb, we can expose struct io_kiocb and io_req_task_work_add() so
> the lower layers can implement their own io_task_work.func. Hopefully, it
> won't be inventively abused...

If we move io_kiocb out, avoiding one indirection would be very easy
indeed.  But I think that just invites abuse.  Note that we also have
at least one and potentially more indirections in this path.  The
request rq_end_io handler is a guaranteed one, and the IPI or softirq
for the request indirection is another one.  So my plan was to look
into having an io_uring specific hook in the core block code to
deliver completions directly to the right io_uring thread.  In the
best case that should allow us to do a single indirect call for
the completion instead of 4 and a pointless IPI/softirq.
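
To make the indirection count concrete, here is a rough userspace
sketch of the two shapes being discussed.  Everything in it (mock_cmd,
the mock_* functions, the call counter) is a hypothetical stand-in and
not the kernel's struct io_kiocb or task-work API.  It only models how
many indirect calls sit between "completion arrives" and "driver code
runs".

```c
#include <stddef.h>

/* Hypothetical stand-ins; not struct io_kiocb or struct io_uring_cmd. */
struct mock_cmd;
typedef void (*mock_cb_t)(struct mock_cmd *cmd);

struct mock_cmd {
	mock_cb_t task_work_func;	/* first hop: task-work handler */
	mock_cb_t driver_cb;		/* second hop: driver callback */
	int indirect_calls;		/* counts calls made through a pointer */
	int completed;
};

/* The driver's actual completion work. */
static void mock_driver_complete(struct mock_cmd *cmd)
{
	cmd->completed = 1;
}

/* Generic handler that bounces into the driver: a second indirect call. */
static void mock_generic_task_work(struct mock_cmd *cmd)
{
	cmd->indirect_calls++;
	cmd->driver_cb(cmd);
}

/* Shape 1: generic task-work handler plus a driver callback (two hops). */
static void mock_run_double(struct mock_cmd *cmd)
{
	cmd->task_work_func = mock_generic_task_work;
	cmd->driver_cb = mock_driver_complete;
	cmd->indirect_calls++;
	cmd->task_work_func(cmd);
}

/* Shape 2: the driver installs its own task-work handler (one hop). */
static void mock_run_single(struct mock_cmd *cmd)
{
	cmd->task_work_func = mock_driver_complete;
	cmd->driver_cb = NULL;
	cmd->indirect_calls++;
	cmd->task_work_func(cmd);
}
```

Shape 2 saves a hop only because the lower layer gets to see the
request structure, which is exactly the exposure that invites abuse.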

>>> +	struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
>>> +
>>> +	if (ret < 0)
>>> +		req_set_fail(req);
>>> +	io_req_complete(req, ret);
>>> +}
>>> +EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>>
>> It seems like all callers of io_req_complete actually call req_set_fail
>> on failure.  So maybe it would be a nice pre-cleanup to handle the
>> req_set_fail call from io_req_complete?
>
> Interpretation of the result is different, e.g. io_tee(), that was the
> reason it was left in the callers.

Yes, there are about two of them that would then need to be open-coded
using __io_req_complete.
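
Roughly what I have in mind, as a userspace sketch only.  The mock_*
names and struct mock_req are made up for illustration and are not the
io_uring functions.  The "short result counts as failure" rule for the
tee-like caller is an assumption standing in for whatever a given
opcode actually checks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request; not struct io_kiocb. */
struct mock_req {
	bool failed;	/* models what req_set_fail() would record */
	bool completed;
	int result;
};

static void mock_req_set_fail(struct mock_req *req)
{
	req->failed = true;
}

/* Raw completion: posts the result and never touches the failure flag. */
static void mock_req_complete_raw(struct mock_req *req, int ret)
{
	req->result = ret;
	req->completed = true;
}

/* Common case: a negative result marks the request failed in one place. */
static void mock_req_complete(struct mock_req *req, int ret)
{
	if (ret < 0)
		mock_req_set_fail(req);
	mock_req_complete_raw(req, ret);
}

/*
 * Exception: a caller with its own failure rule open-codes the check
 * and uses the raw variant.  Assumed rule: a short result is a failure.
 */
static void mock_tee_like_complete(struct mock_req *req, int ret,
				   int expected_len)
{
	if (ret != expected_len)
		mock_req_set_fail(req);
	mock_req_complete_raw(req, ret);
}
```

With that split, a helper like io_uring_cmd_done would not need its own
ret < 0 check, and only the odd callers carry an explicit one.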

>
> [...]
>>> @@ -60,7 +62,10 @@ struct io_uring_sqe {
>>>   		__s32	splice_fd_in;
>>>   		__u32	file_index;
>>>   	};
>>> -	__u64	__pad2[2];
>>> +	union {
>>> +		__u64	__pad2[2];
>>> +		__u64	cmd;
>>> +	};
>>
>> Can someone explain these changes to me a little more?
>
> not required indeed, just
>
> -	__u64	__pad2[2];
> +	__u64	cmd;
> +	__u64	__pad2;

Do we still want a union for cmd, with a comment documenting which
opcode it is for?
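
For reference, a small userspace check that the two layouts quoted
above are interchangeable as far as size and the position of cmd go.
The struct names tail_union and tail_flat are invented here and only
model the last 16 bytes of the SQE, using uint64_t in place of __u64.

```c
#include <stddef.h>
#include <stdint.h>

/* Tail as in the quoted patch: cmd overlays the first pad word. */
struct tail_union {
	union {
		uint64_t pad2[2];
		uint64_t cmd;	/* would be documented as URING_CMD only */
	};
};

/* Tail as in the suggested simplification: cmd, then one pad word. */
struct tail_flat {
	uint64_t cmd;
	uint64_t pad2;
};

/* Both variants keep the tail at 16 bytes with cmd at the same offset. */
_Static_assert(sizeof(struct tail_union) == 16, "union tail is 16 bytes");
_Static_assert(sizeof(struct tail_flat) == 16, "flat tail is 16 bytes");
_Static_assert(offsetof(struct tail_union, cmd) ==
	       offsetof(struct tail_flat, cmd), "cmd offset matches");
```

So the choice between them is about documentation and naming, not about
the ABI.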


