[RFC 01/13] io_uring: add infra for uring_cmd completion in submitter-task
Jens Axboe
axboe at kernel.dk
Thu Feb 17 07:50:59 PST 2022
On 2/17/22 8:39 AM, Kanchan Joshi wrote:
> On Thu, Feb 17, 2022 at 7:43 AM Luis Chamberlain <mcgrof at kernel.org> wrote:
>>
>> On Mon, Dec 20, 2021 at 07:47:22PM +0530, Kanchan Joshi wrote:
>>> Completion of a uring_cmd ioctl may involve referencing certain
>>> ioctl-specific fields, requiring original submitter context.
>>> Export an API that driver can use for this purpose.
>>> The API facilitates reusing task-work infra of io_uring, while driver
>>> gets to implement cmd-specific handling in a callback.
>>>
>>> Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>>> Signed-off-by: Anuj Gupta <anuj20.g at samsung.com>
>>> ---
>>> fs/io_uring.c | 16 ++++++++++++++++
>>> include/linux/io_uring.h | 8 ++++++++
>>> 2 files changed, 24 insertions(+)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index e96ed3d0385e..246f1085404d 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -2450,6 +2450,22 @@ static void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>> io_req_complete_failed(req, -EFAULT);
>>> }
>>>
>>> +static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
>>> +{
>>> +	req->uring_cmd.driver_cb(&req->uring_cmd);
>>
>> If the callback memory area is gone, boom.
>
> Why will the memory area be gone?
> Module removal is protected because try_module_get is done anyway when
> the namespace was opened.
And the req isn't going away before it's completed.
>>> +{
>>> +	struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
>>> +
>>> +	req->uring_cmd.driver_cb = driver_cb;
>>> +	req->io_task_work.func = io_uring_cmd_work;
>>> +	io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
>>
>> This can schedule, and so the callback may go fishing in the meantime.
>
> io_req_task_work_add is safe to be called in atomic context. FWIW,
> io_uring uses this for regular (i.e. direct block) io completion too.
Correct, it doesn't schedule and is safe from irq context as long as the
task is pinned (which it is, via the req itself).
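
To make the lifetime argument concrete, here is a rough userspace model of
the pattern under discussion: the completion side only stores the callback
in the request and queues it, and the callback runs later from the
submitter's side while the request is still alive. This is a sketch, not
kernel code. Every name in it (struct cmd, struct req, complete_in_task,
run_task_work, demo_cb, run_demo) is hypothetical, and the plain linked
list stands in for io_uring's task-work machinery without reproducing its
atomic-context safety.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

struct cmd;
typedef void (*driver_cb_t)(struct cmd *);

struct cmd {
	driver_cb_t driver_cb;	/* stored by the completion side */
	int result;		/* filled in by the callback */
};

struct req {
	struct cmd cmd;		/* embedded, so it lives as long as the req */
	struct req *next;	/* link in the pending work list */
};

static struct req *pending;	/* work queued for the submitter side */

/* Completion side: record the callback and queue the req; never blocks. */
static void complete_in_task(struct cmd *cmd, driver_cb_t cb)
{
	/* Hand-rolled container_of: recover the req that embeds this cmd. */
	struct req *req = (struct req *)((char *)cmd - offsetof(struct req, cmd));

	cmd->driver_cb = cb;
	req->next = pending;
	pending = req;
}

/* Submitter side: run every queued callback; returns how many ran. */
static int run_task_work(void)
{
	int n = 0;

	while (pending) {
		struct req *req = pending;

		pending = req->next;
		req->cmd.driver_cb(&req->cmd);
		n++;
	}
	return n;
}

/* Stand-in for the driver's cmd-specific completion handling. */
static void demo_cb(struct cmd *cmd)
{
	cmd->result = 42;
}

/* Queue one completion, then drain it; returns the callback's result. */
static int run_demo(void)
{
	struct req req = { 0 };

	complete_in_task(&req.cmd, demo_cb);
	if (req.cmd.result != 0)	/* callback must not have run yet */
		return -1;
	if (run_task_work() != 1)
		return -1;
	return req.cmd.result;
}
```

In this model the callback pointer lives inside the request, so it cannot
outlive or be outlived by it: the request is only released after the
queued work has run, which mirrors the point that the req isn't going away
before it's completed.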
--
Jens Axboe