[PATCH V4 2/4] blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING

Ming Lei ming.lei at redhat.com
Wed Sep 9 21:53:21 EDT 2020


On Wed, Sep 09, 2020 at 09:04:09AM -0700, Keith Busch wrote:
> On Wed, Sep 09, 2020 at 06:41:14PM +0800, Ming Lei wrote:
> >  void blk_mq_quiesce_queue(struct request_queue *q)
> >  {
> > -	struct blk_mq_hw_ctx *hctx;
> > -	unsigned int i;
> > -	bool rcu = false;
> > +	bool blocking = !!(q->tag_set->flags & BLK_MQ_F_BLOCKING);
> > +	bool was_quiesced =__blk_mq_quiesce_queue_nowait(q);
> >  
> > -	__blk_mq_quiesce_queue_nowait(q);
> > +	if (!was_quiesced && blocking)
> > +		percpu_ref_kill(&q->dispatch_counter);
> >  
> > -	queue_for_each_hw_ctx(q, hctx, i) {
> > -		if (hctx->flags & BLK_MQ_F_BLOCKING)
> > -			synchronize_srcu(hctx->srcu);
> > -		else
> > -			rcu = true;
> > -	}
> > -	if (rcu)
> > +	if (blocking)
> > +		wait_event(q->mq_quiesce_wq,
> > +				percpu_ref_is_zero(&q->dispatch_counter));
> > +	else
> >  		synchronize_rcu();
> >  }
> 
> In the previous version, you had ensured no thread can unquiesce a queue
> while another is waiting for quiescence. Now that the locking is gone,
> a thread could unquiesce the queue before percpu_ref reaches zero, so
> the wait_event() may never complete on the resurrected percpu_ref.
> 
> I don't think any drivers do such a thing today, but it just looks like
> the implementation leaves open the possibility.

This driver can cause bigger trouble if it unquiesces its queue which is
being quiesced and still not done.

However, we can avoid that by the following delta change:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7669fe815cf9..5632727d71fa 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -225,9 +225,16 @@ static void __blk_mq_quiesce_queue(struct request_queue *q, bool wait)
 	if (!wait)
 		return;
 
+	/*
+	 * In case of F_BLOCKING, if driver unquiesces its queue which is being
+	 * quiesced and still not done, it can cause bigger trouble, and we simply
+	 * return & warn once for avoiding hang here.
+	 */
 	if (blocking)
 		wait_event(q->mq_quiesce_wq,
-				percpu_ref_is_zero(&q->dispatch_counter));
+				percpu_ref_is_zero(&q->dispatch_counter) ||
+				WARN_ON_ONCE(!percpu_ref_is_dying(
+						&q->dispatch_counter)));
 	else
 		synchronize_rcu();
 }

Thanks, 
Ming




More information about the Linux-nvme mailing list