[PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request()

Damien Le Moal damien.lemoal at opensource.wdc.com
Fri Sep 23 18:22:18 PDT 2022


On 9/24/22 10:01, Jens Axboe wrote:
> On 9/23/22 6:59 PM, Damien Le Moal wrote:
>> On 9/24/22 05:54, Jens Axboe wrote:
>>> On 9/23/22 9:13 AM, Pankaj Raghav wrote:
>>>> On 2022-09-23 16:52, Pankaj Raghav wrote:
>>>>> On Thu, Sep 22, 2022 at 12:28:01PM -0600, Jens Axboe wrote:
>>>>>> The filesystem IO path can take advantage of allocating batches of
>>>>>> requests, if the underlying submitter tells the block layer about it
>>>>>> through the blk_plug. For passthrough IO, the exported API is the
>>>>>> blk_mq_alloc_request() helper, and that one does not allow for
>>>>>> request caching.
>>>>>>
>>>>>> Wire up request caching for blk_mq_alloc_request(), which is generally
>>>>>> done without having a bio available upfront.
>>>>>>
>>>>>> Signed-off-by: Jens Axboe <axboe at kernel.dk>
>>>>>> ---
>>>>>>  block/blk-mq.c | 80 ++++++++++++++++++++++++++++++++++++++++++++------
>>>>>>  1 file changed, 71 insertions(+), 9 deletions(-)
>>>>>>
>>>>> I think we need this patch to ensure correct behaviour for passthrough:
>>>>>
>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>> index c11949d66163..840541c1ab40 100644
>>>>> --- a/block/blk-mq.c
>>>>> +++ b/block/blk-mq.c
>>>>> @@ -1213,7 +1213,7 @@ void blk_execute_rq_nowait(struct request *rq, bool at_head)
>>>>>         WARN_ON(!blk_rq_is_passthrough(rq));
>>>>>  
>>>>>         blk_account_io_start(rq);
>>>>> -       if (current->plug)
>>>>> +       if (blk_mq_plug(rq->bio))
>>>>>                 blk_add_rq_to_plug(current->plug, rq);
>>>>>         else
>>>>>                 blk_mq_sched_insert_request(rq, at_head, true, false);
>>>>>
>>>>> As the passthrough path can now support request caching via blk_mq_alloc_request(),
>>>>> and it uses blk_execute_rq_nowait(), bad things can happen at least for zoned
>>>>> devices:
>>>>>
>>>>> static inline struct blk_plug *blk_mq_plug( struct bio *bio)
>>>>> {
>>>>> 	/* Zoned block device write operation case: do not plug the BIO */
>>>>> 	if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio)))
>>>>> 		return NULL;
>>>>> ..
>>>>
>>>> Thinking more about it, even this will not fix it because the op is
>>>> REQ_OP_DRV_OUT if it is an NVMe write for passthrough requests.
>>>>
>>>> @Damien Should the condition in blk_mq_plug() be changed to:
>>>>
>>>> static inline struct blk_plug *blk_mq_plug( struct bio *bio)
>>>> {
>>>> 	/* Zoned block device write operation case: do not plug the BIO */
>>>> 	if (bdev_is_zoned(bio->bi_bdev) && !op_is_read(bio_op(bio)))
>>>> 		return NULL;
>>>
>>> That looks reasonable to me. It'll prevent plug optimizations even
>>> for passthrough on zoned devices, but that's probably fine.
>>
>> Could do:
>>
>> 	if (blk_op_is_passthrough(bio_op(bio)) ||
>> 	    (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio))))
>> 		return NULL;
>>
>> Which I think is way cleaner, no?
>> Unless you want to preserve plugging with passthrough commands on regular
>> (not zoned) drives?
> 
> We most certainly do, without plugging this whole patchset is not
> functional. Nor is batched dispatch, for example.

OK. The change to !op_is_read() is fine then.
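
For reference, the difference between the two candidate conditions can be
sketched in userspace. This is only a rough model under stated assumptions,
not kernel code: the model_* names and the enum values are illustrative
stand-ins, op_is_read() is the helper proposed in this thread (modeled here
as "op == READ"), and each function returns true when plugging would still
be allowed, i.e. when blk_mq_plug() would not return NULL.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for request ops; the values are assumptions. */
enum model_op {
	MODEL_OP_READ    = 0,
	MODEL_OP_WRITE   = 1,
	MODEL_OP_DRV_IN  = 34,
	MODEL_OP_DRV_OUT = 35,
};

/* Hypothetical: models the op_is_read() helper proposed in the thread. */
static inline bool model_op_is_read(enum model_op op)
{
	return op == MODEL_OP_READ;
}

/* Models a write test on the low bit of the op (assumption). */
static inline bool model_op_is_write(enum model_op op)
{
	return (op & 1) != 0;
}

/* Models a passthrough test: driver-private in/out ops only. */
static inline bool model_op_is_passthrough(enum model_op op)
{
	return op == MODEL_OP_DRV_IN || op == MODEL_OP_DRV_OUT;
}

/* Candidate A: no plugging on zoned devices for anything but reads. */
static inline bool plug_allowed_not_read(bool zoned, enum model_op op)
{
	return !(zoned && !model_op_is_read(op));
}

/* Candidate B: no plugging for any passthrough op, nor for zoned writes. */
static inline bool plug_allowed_no_passthrough(bool zoned, enum model_op op)
{
	return !(model_op_is_passthrough(op) ||
		 (zoned && model_op_is_write(op)));
}

/* Spot checks of the cases discussed in the thread. */
static inline void model_self_check(void)
{
	/* Regular drive, passthrough: A keeps plugging, B drops it. */
	assert(plug_allowed_not_read(false, MODEL_OP_DRV_OUT));
	assert(!plug_allowed_no_passthrough(false, MODEL_OP_DRV_OUT));

	/* Zoned drive: A plugs reads only, so passthrough is not plugged. */
	assert(plug_allowed_not_read(true, MODEL_OP_READ));
	assert(!plug_allowed_not_read(true, MODEL_OP_DRV_IN));
	assert(!plug_allowed_not_read(true, MODEL_OP_DRV_OUT));
}
```

In this model, candidate B disables plugging for passthrough on regular
drives as well, which is the case the patchset depends on, while candidate A
only gives up plugging on zoned devices.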

> 

-- 
Damien Le Moal
Western Digital Research



