[PATCH V4 2/4] blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING

Keith Busch kbusch at kernel.org
Thu Sep 10 10:55:18 EDT 2020


On Thu, Sep 10, 2020 at 09:53:21AM +0800, Ming Lei wrote:
> On Wed, Sep 09, 2020 at 09:04:09AM -0700, Keith Busch wrote:
> > On Wed, Sep 09, 2020 at 06:41:14PM +0800, Ming Lei wrote:
> > >  void blk_mq_quiesce_queue(struct request_queue *q)
> > >  {
> > > -	struct blk_mq_hw_ctx *hctx;
> > > -	unsigned int i;
> > > -	bool rcu = false;
> > > +	bool blocking = !!(q->tag_set->flags & BLK_MQ_F_BLOCKING);
> > > +	bool was_quiesced =__blk_mq_quiesce_queue_nowait(q);
> > >  
> > > -	__blk_mq_quiesce_queue_nowait(q);
> > > +	if (!was_quiesced && blocking)
> > > +		percpu_ref_kill(&q->dispatch_counter);
> > >  
> > > -	queue_for_each_hw_ctx(q, hctx, i) {
> > > -		if (hctx->flags & BLK_MQ_F_BLOCKING)
> > > -			synchronize_srcu(hctx->srcu);
> > > -		else
> > > -			rcu = true;
> > > -	}
> > > -	if (rcu)
> > > +	if (blocking)
> > > +		wait_event(q->mq_quiesce_wq,
> > > +				percpu_ref_is_zero(&q->dispatch_counter));
> > > +	else
> > >  		synchronize_rcu();
> > >  }
> > 
> > In the previous version, you had ensured no thread can unquiesce a queue
> > while another is waiting for quiescence. Now that the locking is gone,
> > a thread could unquiesce the queue before percpu_ref reaches zero, so
> > the wait_event() may never complete on the resurrected percpu_ref.
> > 
> > I don't think any drivers do such a thing today, but it just looks like
> > the implementation leaves open the possibility.
> 
> This driver can cause bigger trouble if it unquiesces its queue which is
> being quiesced and still not done.
> 
> However, we can avoid that by the following delta change:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 7669fe815cf9..5632727d71fa 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -225,9 +225,16 @@ static void __blk_mq_quiesce_queue(struct request_queue *q, bool wait)
>  	if (!wait)
>  		return;
>  
> +	/*
> +	 * In case of F_BLOCKING, if driver unquiesces its queue which is being
> +	 * quiesced and still not done, it can cause bigger trouble, and we simply
> +	 * return & warn once for avoiding hang here.
> +	 */
>  	if (blocking)
>  		wait_event(q->mq_quiesce_wq,
> -				percpu_ref_is_zero(&q->dispatch_counter));
> +				percpu_ref_is_zero(&q->dispatch_counter) ||
> +				WARN_ON_ONCE(!percpu_ref_is_dying(
> +						&q->dispatch_counter)));
>  	else
>  		synchronize_rcu();
>  }

Yeah, I'm okay with this. A warning and return should be good to
indicate driver sequence errors. So if you just want to fold the above
into this patch, then I don't think I have any remaining concerns.

The other race condition is if unquiesce resurrects the ref before
quiesce kills it, and there's already a warning there too.



More information about the Linux-nvme mailing list