[PATCH V4 2/4] blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING
Keith Busch
kbusch at kernel.org
Thu Sep 10 10:55:18 EDT 2020
On Thu, Sep 10, 2020 at 09:53:21AM +0800, Ming Lei wrote:
> On Wed, Sep 09, 2020 at 09:04:09AM -0700, Keith Busch wrote:
> > On Wed, Sep 09, 2020 at 06:41:14PM +0800, Ming Lei wrote:
> > > void blk_mq_quiesce_queue(struct request_queue *q)
> > > {
> > > - struct blk_mq_hw_ctx *hctx;
> > > - unsigned int i;
> > > - bool rcu = false;
> > > + bool blocking = !!(q->tag_set->flags & BLK_MQ_F_BLOCKING);
> > > + bool was_quiesced =__blk_mq_quiesce_queue_nowait(q);
> > >
> > > - __blk_mq_quiesce_queue_nowait(q);
> > > + if (!was_quiesced && blocking)
> > > + percpu_ref_kill(&q->dispatch_counter);
> > >
> > > - queue_for_each_hw_ctx(q, hctx, i) {
> > > - if (hctx->flags & BLK_MQ_F_BLOCKING)
> > > - synchronize_srcu(hctx->srcu);
> > > - else
> > > - rcu = true;
> > > - }
> > > - if (rcu)
> > > + if (blocking)
> > > + wait_event(q->mq_quiesce_wq,
> > > + percpu_ref_is_zero(&q->dispatch_counter));
> > > + else
> > > synchronize_rcu();
> > > }
> >
> > In the previous version, you had ensured no thread can unquiesce a queue
> > while another is waiting for quiescence. Now that the locking is gone,
> > a thread could unquiesce the queue before percpu_ref reaches zero, so
> > the wait_event() may never complete on the resurrected percpu_ref.
> >
> > I don't think any drivers do such a thing today, but it just looks like
> > the implementation leaves open the possibility.
>
> This driver can cause bigger trouble if it unquiesces its queue which is
> being quiesced and still not done.
>
> However, we can avoid that by the following delta change:
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 7669fe815cf9..5632727d71fa 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -225,9 +225,16 @@ static void __blk_mq_quiesce_queue(struct request_queue *q, bool wait)
> if (!wait)
> return;
>
> + /*
> + * In case of F_BLOCKING, if driver unquiesces its queue which is being
> + * quiesced and still not done, it can cause bigger trouble, and we simply
> + * return & warn once for avoiding hang here.
> + */
> if (blocking)
> wait_event(q->mq_quiesce_wq,
> - percpu_ref_is_zero(&q->dispatch_counter));
> + percpu_ref_is_zero(&q->dispatch_counter) ||
> + WARN_ON_ONCE(!percpu_ref_is_dying(
> + &q->dispatch_counter)));
> else
> synchronize_rcu();
> }
Yeah, I'm okay with this. A warning and return should be good to
indicate driver sequence errors. So if you just want to fold the above
into this patch, then I don't think I have any remaining concerns.
The other race condition is if unquiesce resurrects the ref before
quiesce kills it, and there's already a warning there too.
More information about the Linux-nvme
mailing list