[PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

Tue Sep 19 08:48:23 PDT 2017

On Tue, Sep 19 2017 at  1:43am -0400,
Ming Lei <ming.lei at redhat.com> wrote:

> On Mon, Sep 18, 2017 at 03:18:16PM +0000, Bart Van Assche wrote:
> > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > > "if no request has completed before the delay has expired" can't be a
> > > reason to rerun the queue, because the queue can still be busy.
> > 
> > That statement of yours shows that there are important aspects of the SCSI
> > core and dm-mpath driver that you don't understand.
> 
> Then can you tell me why blk-mq's SCHED_RESTART can't cover
> the rerun when there are in-flight requests? What is the case
> in which dm-rq can return BUSY while there aren't any in-flight
> requests at the same time?
> 
> Also, you are the author of the change that added
> 'blk_mq_delay_run_hw_queue(hctx, 100/*ms*/)' to dm-rq, yet commit
> 6077c2d706097c0 ("dm rq: Avoid that request processing stalls
> sporadically") never explains what the root cause of your request
> stall is or why the patch fixes it. It doesn't even explain why
> the delay is 100ms.
> 
> So it is a workaround, isn't it?
> 
> My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 100/*ms*/)
> in the hot path, since the rerun should already be covered by
> SCHED_RESTART if there are in-flight requests.

This thread shows that relying on fixed delays like this is brittle:
https://patchwork.kernel.org/patch/9703249/
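
To make the trade-off being argued above concrete, here is a toy
user-space sketch. It is not kernel code and does not reflect how blk-mq
or dm-rq are implemented; every name in it is hypothetical. It assumes a
request gets BUSY at t = 0 and the resource it needs frees up at
busy_until_ms, then compares a fixed-delay rerun with a rerun triggered
by a completion:

```c
#include <assert.h>

/*
 * Toy user-space model, NOT kernel code. All names are hypothetical.
 * A request gets BUSY at t = 0; the resource frees up at busy_until_ms.
 */

/*
 * Fixed-delay rerun: retry every delay_ms until the resource is free.
 * Returns the time of the successful retry and counts the reruns that
 * found the resource still busy.
 */
int fixed_delay_dispatch_ms(int busy_until_ms, int delay_ms, int *wasted_reruns)
{
    int t = 0;

    assert(delay_ms > 0);
    *wasted_reruns = 0;
    for (;;) {
        t += delay_ms;
        if (t >= busy_until_ms)
            return t;
        (*wasted_reruns)++;
    }
}

/*
 * Completion-driven rerun: retry when an in-flight request completes at
 * busy_until_ms. Returns -1 when nothing is in flight, because then no
 * completion exists to trigger the rerun.
 */
int completion_driven_dispatch_ms(int inflight, int busy_until_ms)
{
    return inflight > 0 ? busy_until_ms : -1;
}
```

In this model, with a 100 ms delay and a resource that frees up after
5 ms, the fixed-delay rerun dispatches at 100 ms while the
completion-driven one dispatches at 5 ms. If the resource stays busy for
250 ms, the fixed delay wastes two reruns on a still-busy queue. The only
case the completion-driven variant cannot handle is the one with no
in-flight requests, which is the case asked about above.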

Mike