switch block layer polling to a bio based model v4

Sagi Grimberg sagi at grimberg.me
Tue Oct 12 08:09:20 PDT 2021



On 10/12/21 5:58 PM, Jens Axboe wrote:
> On 10/12/21 8:57 AM, Sagi Grimberg wrote:
>>
>>>> Hi all,
>>>>
>>>> This series cleans up the block polling code a bit and changes the interface
>>>> to poll for a specific bio instead of a request_queue and cookie pair.
>>>>
>>>> Polling for the bio itself leads to a few advantages:
>>>>
>>>>     - the cookie construction can be made entirely private in blk-mq.c
>>>>     - the caller does not need to remember the request_queue and cookie
>>>>       separately and thus sidesteps their lifetime issues
>>>>     - keeping the device and the cookie inside the bio makes it trivial to
>>>>       support polling of BIOs remapped by stacking drivers
>>>>     - a lot of code to propagate the cookie back up the submission path can
>>>>       be removed entirely
>>>>
>>>> The one major caveat is that this requires RCU freeing polled BIOs to make
>>>> sure the bio that contains the polling information is still alive when
>>>> io_uring tries to poll it through the iocb. For synchronous polling all the
>>>> callers have a bio reference anyway, so this is not an issue.
>>>
>>> I ran this through the usual peak testing, and it doesn't seem to regress
>>> anything for me. We're still at ~7.4M polled IOPS on a single CPU
>>> core:
>>>
>>> taskset -c 0,16 t/io_uring -d128 -b512 -s32 -c32 -p1 -F1 -B1 -D1 -n2 /dev/nvme1n1 /dev/nvme2n1
>>> Added file /dev/nvme1n1 (submitter 0)
>>> Added file /dev/nvme2n1 (submitter 1)
>>> polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
>>> Engine=io_uring, sq_ring=128, cq_ring=256
>>> submitter=0, tid=1199
>>> submitter=1, tid=1200
>>> IOPS=7322112, BW=3575MiB/s, IOS/call=32/31, inflight=(110 71)
>>> IOPS=7452736, BW=3639MiB/s, IOS/call=32/31, inflight=(52 80)
>>> IOPS=7419904, BW=3623MiB/s, IOS/call=32/31, inflight=(78 104)
>>> IOPS=7392576, BW=3609MiB/s, IOS/call=32/32, inflight=(75 102)
>>
>> Jens, is that with nvme_core.multipath=Y ?
> 
> No, I don't have multipath enabled. I can run that too, if you'd like.

That would be useful to learn. Thanks.


