[PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

Mon Sep 18 22:43:13 PDT 2017

On Mon, Sep 18, 2017 at 03:18:16PM +0000, Bart Van Assche wrote:
> On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > "if no request has completed before the delay has expired" can't be a
> > reason to rerun the queue, because the queue can still be busy.
> 
> That statement of you shows that there are important aspects of the SCSI
> core and dm-mpath driver that you don't understand.

Then can you tell me why blk-mq's SCHED_RESTART can't cover
the rerun when there are in-flight requests? In which case
can dm-rq return BUSY while there aren't any in-flight
requests in the meantime?

Also, you are the one who added 'blk_mq_delay_run_hw_queue(
hctx, 100/*ms*/)' to dm-rq, but commit 6077c2d706097c0
("dm rq: Avoid that request processing stalls sporadically")
never explains what the root cause of your request stall is,
or why that patch fixes your issue. It doesn't even explain
why the delay is 100ms.

So it is a workaround, isn't it?

My concern is that it isn't good to add
blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in the hot path, since
the rerun should already be covered by SCHED_RESTART if there are
in-flight requests.
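
To make the argument concrete, here is a toy userspace model of a
completion-driven restart. It is only a sketch: none of these names
are kernel APIs, 'need_restart' merely stands in for the SCHED_RESTART
idea, and 'capacity' is an assumed per-driver limit on in-flight
requests.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not blk-mq or dm-rq code. */
struct toy_queue {
	int pending;        /* requests waiting to be dispatched */
	int in_flight;      /* requests currently owned by the "driver" */
	int capacity;       /* assumed limit on in-flight requests */
	bool need_restart;  /* stand-in for the SCHED_RESTART idea */
	int completed;      /* requests finished so far */
};

/* Dispatch until the driver reports busy; return the number dispatched. */
static int toy_run_queue(struct toy_queue *q)
{
	int dispatched = 0;

	while (q->pending > 0) {
		if (q->in_flight >= q->capacity) {
			/* Busy: remember that a rerun is needed later. */
			q->need_restart = true;
			break;
		}
		q->pending--;
		q->in_flight++;
		dispatched++;
	}
	return dispatched;
}

/* Complete one in-flight request and rerun the queue if it was marked. */
static void toy_complete_one(struct toy_queue *q)
{
	if (q->in_flight == 0)
		return;
	q->in_flight--;
	q->completed++;
	if (q->need_restart) {
		q->need_restart = false;
		toy_run_queue(q);
	}
}

/*
 * Run the queue once, then let completions drive everything else.
 * No timer is involved. Returns true if nothing is left pending.
 */
static bool toy_drain(struct toy_queue *q)
{
	toy_run_queue(q);
	while (q->in_flight > 0)
		toy_complete_one(q);
	return q->pending == 0;
}
```

With pending = 10 and capacity = 2, toy_drain() finishes all ten
requests without any delayed rerun, because every busy condition is
followed by a completion. With capacity = 0 (busy while nothing is in
flight) no completion ever arrives and the queue stays stuck, which is
the only situation in this model where a timer would matter.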

> 
> > I suggest to understand the root cause, instead of keeping this
> > ugly random delay because run hw queue after 100ms may be useless
> > in 99.99% times.
> 
> If you are still looking at removing the blk_mq_delay_run_hw_queue() calls
> then I think you are looking in the wrong direction. What kind of problem
> are you trying to solve? Is it perhaps that there can be a delay between

Actually, the dm-rq I/O scheduling improvement (patches 2-5) doesn't
depend on this patch.

-- 
Ming