switch block layer polling to a bio based model v4

Jens Axboe axboe at kernel.dk
Tue Oct 12 07:47:29 PDT 2021


On 10/12/21 5:12 AM, Christoph Hellwig wrote:
> Hi all,
> 
> This series cleans up the block polling code a bit and changes the interface
> to poll for a specific bio instead of a request_queue and cookie pair.
> 
> Polling for the bio itself leads to a few advantages:
> 
>   - the cookie construction can be made entirely private in blk-mq.c
>   - the caller does not need to remember the request_queue and cookie
>     separately and thus sidesteps their lifetime issues
>   - keeping the device and the cookie inside the bio makes it trivial to
>     support polling of BIOs remapped by stacking drivers
>   - a lot of code to propagate the cookie back up the submission path can
>     be removed entirely
> 
> The one major caveat is that this requires RCU freeing of polled BIOs to make
> sure the bio that contains the polling information is still alive when
> io_uring tries to poll it through the iocb. For synchronous polling all the
> callers have a bio reference anyway, so this is not an issue.

I ran this through the usual peak testing, and it doesn't seem to regress
anything for me. We're still at around 7.4M polled IOPS on a single CPU
core:

taskset -c 0,16 t/io_uring -d128 -b512 -s32 -c32 -p1 -F1 -B1 -D1 -n2 /dev/nvme1n1 /dev/nvme2n1
Added file /dev/nvme1n1 (submitter 0)
Added file /dev/nvme2n1 (submitter 1)
polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=256
submitter=0, tid=1199
submitter=1, tid=1200
IOPS=7322112, BW=3575MiB/s, IOS/call=32/31, inflight=(110 71)
IOPS=7452736, BW=3639MiB/s, IOS/call=32/31, inflight=(52 80)
IOPS=7419904, BW=3623MiB/s, IOS/call=32/31, inflight=(78 104)
IOPS=7392576, BW=3609MiB/s, IOS/call=32/32, inflight=(75 102)

with some of my pending changes and hacks. Using IRQ mode, we're at around
4.9M IOPS, and I don't see any particular impact of needing deferred RCU free
of the bio for that case.
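
For anyone cross-checking the numbers above: the BW column is just IOPS
times the 512-byte block size (-b512). A quick sketch of that arithmetic,
assuming the tool truncates to whole MiB/s (which matches the output shown);
the function and variable names here are made up for illustration:

```python
# Sanity check of the reported figures: BW = IOPS * block size.
# Assumption: bandwidth is truncated to whole MiB/s, as the output suggests.
BLOCK_SIZE = 512   # bytes per IO, from -b512
MIB = 1 << 20      # bytes per MiB


def bandwidth_mib(iops, block_size=BLOCK_SIZE):
    """Return bandwidth in whole MiB/s for a given IOPS figure."""
    return iops * block_size // MIB


# The four IOPS samples from the polled run above.
samples = [7322112, 7452736, 7419904, 7392576]

for iops in samples:
    print(f"IOPS={iops}, BW={bandwidth_mib(iops)}MiB/s")

# Average over the four samples, i.e. the "around 7.4M" figure.
mean_iops = sum(samples) // len(samples)
print(f"mean IOPS={mean_iops}")
```

That gives 3575, 3639, 3623 and 3609 MiB/s for the four samples, same as
reported.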

-- 
Jens Axboe



