[PATCH 1/2] blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset
Jens Axboe
axboe at kernel.dk
Mon Feb 27 08:17:33 PST 2017
On 02/27/2017 09:14 AM, Omar Sandoval wrote:
> On Mon, Feb 27, 2017 at 06:10:01PM +0200, Sagi Grimberg wrote:
>>
>>>> Hm, this may fix the crash, but I'm not sure it'll work as intended.
>>>> When we allocate the request, we'll get a reserved scheduler tag, but
>>>> then when we go to dispatch the request and call
>>>> blk_mq_get_driver_tag(), we'll be competing with all of the normal
>>>> requests for a regular driver tag. So maybe on top of this we should add
>>>> the BLK_MQ_REQ_RESERVED flag to the allocation attempt in
>>>> blk_mq_get_driver_tag() if the scheduler tag is reserved? I'm hazy on
>>>> what we expect from reserved tags, so feel free to call me crazy.
>>>
>>> Yeah good point, we need to carry it through. Reserved tags exist
>>> because drivers often need a request/tag for error handling. If all
>>> tags currently are used up for regular IO that is stuck, you need
>>> a reserved tag for error handling to guarantee progress.
>>>
>>> So Sagi's patch does take it halfway there, but get_driver_tag
>>> really needs to know about this as well, or we will just get stuck
>>> there too. Two solutions I can think of:
>>>
>>> 1) Check the tag value in get_driver_tag, add BLK_MQ_REQ_RESERVED
>>> when allocating a driver tag if above X.
>>> 2) Add an RQF_SOMETHING_RESERVED. Add BLK_MQ_REQ_RESERVED in
>>> get_driver_tag if that is set.
>>>
>>> Comments?
>
> Option 1 looks simple enough that I don't think it warrants a new
> request flag (compile tested only):
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 9e6b064e5339..87590f7d4f80 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -852,6 +852,9 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
> return true;
> }
>
> + if (rq->internal_tag < data.hctx->sched_tags->nr_reserved_tags)
> + data.flags |= BLK_MQ_REQ_RESERVED;
> +
> rq->tag = blk_mq_get_tag(&data);
> if (rq->tag >= 0) {
> if (blk_mq_tag_busy(data.hctx)) {
Agreed, that's functionally identical to what I just sent out.
>> Can't we just not go through the scheduler for reserved tags? Obviously
>> there is no point in scheduling them...
>
> That sounds nice, since I'd be worried about the scheduler also needing
> to be aware of the reserved status lest it also get the reserved request
> stuck behind some normal requests. But we special-case flush in this
> way, and it has been a huge pain.
The caller had better be using head insertion for this, in case we
already have requests in the queue. But that's no different from the
current logic.
So I think it should work fine.
--
Jens Axboe
More information about the Linux-nvme mailing list