[PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset
Ming Lei
ming.lei at redhat.com
Thu Jun 12 21:56:40 PDT 2025
On Wed, Jun 04, 2025 at 11:17:05AM +0000, Shinichiro Kawasaki wrote:
> Cc+: Ming,
>
> Hi Sagi, thanks for the background explanation and the suggestion.
>
> On Jun 04, 2025 / 10:10, Sagi Grimberg wrote:
> ...
> > > This is a problem. We are flushing a work that is IO bound, which can
> > > take a long time to complete.
> > > Up until now, we have deliberately avoided introducing dependency
> > > between reset forward progress
> > > and scan work IO to completely finish.
> > >
> > > I would like to keep it this way.
> > >
> > > BTW, this is not TCP specific.
>
> I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's
> why it was reported for the TCP transport.
>
> >
> > blk_mq_unquiesce_queue is still very much safe to call as many times as we
> > want.
> > The only thing that comes in the way is this pesky WARN. How about we make
> > it go away and have
> > a debug print instead?
> >
> > My preference would be to allow nvme to unquiesce queues that were not
> > previously quiesced (just
> > like it historically was) instead of having to block a controller reset
> > until the scan_work is completed (which
> > is admin I/O dependent, and may get stuck until admin timeout, which can be
> > changed by the user for 60
> > minutes or something arbitrarily long btw).
> >
> > How about something like this patch instead:
> > --
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index c2697db59109..74f3ad16e812 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
> > bool run_queue = false;
> >
> > spin_lock_irqsave(&q->queue_lock, flags);
> > - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) {
> > - ;
> > + if (q->quiesce_depth <= 0) {
> > + printk(KERN_DEBUG
> > + "dev %s: unquiescing a non-quiesced queue,
> > expected?\n",
> > + q->disk ? q->disk->disk_name : "?", );
> > } else if (!--q->quiesce_depth) {
> > blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
> > run_queue = true;
> > --
>
> The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support
> concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I
> ask your comment on the suggestion by Sagi?
I think it is bad to use one standard block layer API in this unbalanced way,
that is why WARN_ON_ONCE() is added. We shouldn't encourage driver to use
APIs in this way.
Question is why nvme have to unquiesce one non-quiesced queue?
Thanks,
Ming
More information about the Linux-nvme
mailing list