[RFC 3/5] io_uring: add infra and support for IORING_OP_URING_CMD
Christoph Hellwig
hch at lst.de
Mon Apr 4 22:58:35 PDT 2022
On Mon, Apr 04, 2022 at 09:20:00AM +0100, Pavel Begunkov wrote:
>> I'm still not a fan of the double indirect call here. I don't really
>> have a good idea yet, but I plan to look into it.
>
> I haven't familiarised myself with the series properly, but if it's about
> driver_cb, we can expose struct io_kiocb and io_req_task_work_add() so
> the lower layers can implement their own io_task_work.func. Hopefully, it
> won't be inventively abused...
If we move io_kiocb out, avoiding one indirection would be very easy
indeed. But I think that just invites abuse. Note that we also have
at least one and potentially more indirections in this path. The
request rq_end_io handler is a guaranteed one, and the IPI or softirq
for the request indirection is another one. So my plan was to look
into having an io_uring specific hook in the core block code to
deliver completions directly to the right io_uring thread. In the
best case that should allow us to do a single indirect call for
the completion instead of 4 and a pointless IPI/softirq.
>>> + struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
>>> +
>>> + if (ret < 0)
>>> + req_set_fail(req);
>>> + io_req_complete(req, ret);
>>> +}
>>> +EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>>
>> It seems like all callers of io_req_complete actually call req_set_fail
>> on failure. So maybe it would be nice pre-cleanup to handle the
>> req_set_fail call from io_req_complete?
>
> Interpretation of the result is different, e.g. io_tee(), that was the
> reason it was left in the callers.
Yes, there are about two of them that would then need to be open-coded
using __io_req_complete.
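A rough userspace sketch of what that pre-cleanup could look like, just
to make the split concrete. Everything here is a hypothetical stand-in:
struct mock_req, mock_req_complete(), mock_req_complete_raw() and
mock_tee_complete() are made-up names rather than the real io_uring
code, and the "short transfer counts as failure" rule is an assumption
used only to show a caller that interprets the result differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct io_kiocb; only what the sketch needs. */
struct mock_req {
	bool failed;	/* set where the real code would call req_set_fail() */
	long result;	/* value that would end up in the completion */
};

/* Open-coded variant: records the result, never touches the failure flag. */
static void mock_req_complete_raw(struct mock_req *req, long ret)
{
	assert(req != NULL);
	req->result = ret;
}

/* Common case: a negative result marks the request as failed. */
static void mock_req_complete(struct mock_req *req, long ret)
{
	assert(req != NULL);
	if (ret < 0)
		req->failed = true;
	mock_req_complete_raw(req, ret);
}

/*
 * Caller with its own rule (assumed here: anything other than a full
 * transfer is a failure), so it sets the flag itself and then uses the
 * open-coded variant.
 */
static void mock_tee_complete(struct mock_req *req, long ret, long len)
{
	assert(req != NULL);
	if (ret != len)
		req->failed = true;
	mock_req_complete_raw(req, ret);
}
```

With this shape the common callers lose their own failure check, and only
the odd ones out keep it next to the raw completion call.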
>
> [...]
>>> @@ -60,7 +62,10 @@ struct io_uring_sqe {
>>> __s32 splice_fd_in;
>>> __u32 file_index;
>>> };
>>> - __u64 __pad2[2];
>>> + union {
>>> + __u64 __pad2[2];
>>> + __u64 cmd;
>>> + };
>>
>> Can someone explain these changes to me a little more?
>
> not required indeed, just
>
> - __u64 __pad2[2];
> + __u64 cmd;
> + __u64 __pad2;
Do we still want a union for cmd and document it to say what
opcode it is for?
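To illustrate the question, here is a minimal sketch of only the last
16 bytes of the SQE with a documented union. The struct names tail_old
and tail_new, the member name pad, and the comment tying cmd to
IORING_OP_URING_CMD are assumptions for illustration, not the layout
that was actually proposed or merged. The asserts only show that such a
union would leave the size and offsets of the tail unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail as it is before the patch: two reserved 64-bit words. */
struct tail_old {
	uint64_t pad2[2];
};

/* Hypothetical variant: the first word becomes a documented union. */
struct tail_new {
	union {
		uint64_t cmd;	/* assumed: only meaningful for IORING_OP_URING_CMD */
		uint64_t pad;	/* reserved for every other opcode */
	};
	uint64_t pad2;		/* still reserved */
};

/* The tail must keep its size, and cmd must sit in the first old pad word. */
static_assert(sizeof(struct tail_new) == sizeof(struct tail_old),
	      "tail size must not change");
static_assert(offsetof(struct tail_new, cmd) == 0,
	      "cmd must overlay the first pad word");
static_assert(offsetof(struct tail_new, pad2) == sizeof(uint64_t),
	      "remaining pad must stay in the second word");
```

This needs a C11 compiler for the anonymous union and static_assert.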