[dm-devel] [PATCH V4] blk-mq: introduce BLK_STS_DEV_RESOURCE
Ming Lei
tom.leiming at gmail.com
Mon Jan 29 21:55:23 PST 2018
On Tue, Jan 30, 2018 at 6:14 AM, Mike Snitzer <snitzer at redhat.com> wrote:
> On Mon, Jan 29 2018 at 4:51pm -0500,
> Bart Van Assche <Bart.VanAssche at wdc.com> wrote:
>
>> On Mon, 2018-01-29 at 16:44 -0500, Mike Snitzer wrote:
>> > But regardless of which wins the race, the queue will have been run.
>> > Which is all we care about right?
>>
>> Running the queue is not sufficient. With this patch applied it can happen
>> that the block driver returns BLK_STS_DEV_RESOURCE, that the two or more
>> concurrent queue runs finish before sufficient device resources are available
>> to execute the request and that blk_mq_delay_run_hw_queue() does not get
>> called at all. If no other activity triggers a queue run, e.g. request
>> completion, this will result in a queue stall.
>
> If BLK_STS_DEV_RESOURCE is returned then the driver doesn't need to rely
> on a future queue run. IIUC, that is the entire premise of
> BLK_STS_DEV_RESOURCE. If the driver had doubt about whether the
> resource were going to be available in the future it should return
> BLK_STS_RESOURCE.
>
> That may seem like putting a lot on a driver developer (to decide
> between the 2) but I'll again defer to Jens here. This was the approach
> he was advocating be pursued.
Thinking of further, maybe you can add the following description in V5,
and it should be much easier for driver developer to follow:
When any resource allocation fails, if driver can make sure that there is
any in-flight IO, it is safe to return BLK_STS_DEV_RESOURCE to blk-mq,
that is exactly what scsi_queue_rq() is doing.
Follows the theory:
1) driver returns BLK_STS_DEV_RESOURCE if driver figures out
there is any in-flight IO, in case of any resource allocation failure
2) If all these in-flight IOs complete before examining SCHED_RESTART in
blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
is run immediately in this case by blk_mq_dispatch_rq_list();
3) if there is any in-flight IO after/when examining SCHED_RESTART in
blk_mq_dispatch_rq_list():
- if SCHED_RESTART isn't set, queue is run immediately as handled in 2)
- otherwise, this request will be dispatched after any in-flight IO is
completed via
blk_mq_sched_restart() since this request is added to hctx->dispatch already
And there are two invariants when driver returns BLK_STS_DEV_RESOURCE
iff there is any in-flight IOs:
1) SCHED_RESTART must be zero if no in-flight IOs
2) there has to be any IO in-flight if SCHED_RESTART is read as 1
Thanks,
Ming
More information about the Linux-nvme
mailing list