[LSF/MM/BPF ATTEND][LSF/MM/BPF Topic] Non-block IO

Jens Axboe axboe at kernel.dk
Tue Apr 11 15:53:43 PDT 2023


On 4/11/23 4:48 PM, Kanchan Joshi wrote:
>>> 4. Direct NVMe queues - will there be interest in having io_uring
>>> managed NVMe queues?  Sort of a new ring, for which I/O is destaged from
>>> io_uring SQE to NVMe SQE without having to go through intermediate
>>> constructs (i.e., bio/request). Hopefully, that can further amp up the
>>> efficiency of IO.
>>
>> This is interesting, and I've pondered something like that before too. I
>> think it's worth investigating and hacking up a prototype. I recently
>> had one user of IOPOLL assume that setting up a ring with IOPOLL would
>> automatically create a polled queue on the driver side and that is what
>> would be used for IO. And while that's not how it currently works, it
>> definitely does make sense and we could make some things faster like
>> that. It would also potentially make it easier to enable the
>> cancelation referenced in #1 above, if it's restricted to the queue(s)
>> that the ring "owns".
> 
> So I am looking at prototyping it, exclusively for the polled-io case.
> And for that, is there already a way to ensure that there are no
> concurrent submissions to this ring (set with IORING_SETUP_IOPOLL
> flag)?
> That will be the case generally (and submissions happen under
> uring_lock mutex), but submission may still get punted to io-wq
> worker(s) which do not take that mutex.
> So the original task and a worker may end up submitting concurrently.

io-wq may indeed get in your way. But I think for something like this,
you'd never want to punt to io-wq to begin with. If userspace is managing
the queue, then by definition you cannot run out of tags. If there are
other conditions for this kind of request that may run into out-of-memory
conditions, then the error just needs to be returned.

With that, you have exclusive submits on that ring and lower down.
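
As a rough illustration of the two paragraphs above, here is a toy
userspace model, not kernel or io_uring code. It assumes a queue of fixed
depth owned by a single submitter that tracks its own in-flight tags, so
a full queue is reported as an error instead of being deferred to another
context. All names (owned_queue, oq_submit, oq_complete, OQ_DEPTH) and
the choice of -EBUSY are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define OQ_DEPTH 4u  /* hypothetical queue depth; must be <= 32 for the bitmap */

/* Hypothetical model of a queue that one ring owns exclusively. */
struct owned_queue {
    uint32_t busy;      /* bit i set => tag i is in flight */
    unsigned inflight;  /* number of tags currently in flight */
};

static void oq_init(struct owned_queue *q)
{
    q->busy = 0;
    q->inflight = 0;
}

/* Slots still available; the submitter checks this before submitting. */
static unsigned oq_free_slots(const struct owned_queue *q)
{
    return OQ_DEPTH - q->inflight;
}

/*
 * Submit one command. Returns the allocated tag (>= 0) on success, or
 * -EBUSY when every tag is in flight. Nothing is punted elsewhere.
 */
static int oq_submit(struct owned_queue *q)
{
    for (unsigned tag = 0; tag < OQ_DEPTH; tag++) {
        if (!(q->busy & (1u << tag))) {
            q->busy |= 1u << tag;
            q->inflight++;
            return (int)tag;
        }
    }
    return -EBUSY;
}

/* Reap one completion found by polling; the tag becomes reusable. */
static void oq_complete(struct owned_queue *q, int tag)
{
    assert(tag >= 0 && (unsigned)tag < OQ_DEPTH);
    assert(q->busy & (1u << tag));
    q->busy &= ~(1u << tag);
    q->inflight--;
}
```

A submitter that checks oq_free_slots() before each oq_submit() never
sees the error in this model; the error path only covers a submitter that
overruns its own accounting.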

> The IORING_SETUP_SINGLE_ISSUER flag is not meant for this case, or is it?

It's not. It enables optimizations based on the ring creator promising
that only one userspace task submits requests on this ring.

-- 
Jens Axboe

More information about the Linux-nvme mailing list