[PATCH v5 1/2] blk-mq: add tagset quiesce interface

Paul E. McKenney paulmck at kernel.org
Tue Jul 28 09:54:36 EDT 2020


On Tue, Jul 28, 2020 at 02:24:38AM -0700, Sagi Grimberg wrote:
> 
> > > I like the tagset based interface.  But the idea of doing a per-hctx
> > > allocation and wait doesn't seem very scalable.
> > > 
> > > Paul, do you have any good idea for an interface that waits on
> > > multiple srcu heads?  As far as I can tell we could just have a single
> > > global completion and counter, and each call_srcu would just
> > > decrement it and then the final one would do the wakeup.  It would just
> > > be great to figure out a way to keep the struct rcu_synchronize and
> > > counter on stack to avoid an allocation.
> > > 
> > > But if we can't do with an on-stack object I'd much rather just embed
> > > the rcu_head in the hw_ctx.
> > 
> > I think we can do that, please see the following patch which is against Sagi's V5:
> 
> I don't think you can send a single rcu_head to multiple call_srcu calls.

Indeed you cannot.  And if you build with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
it will yell at you when you try.

You -can- pass on-stack rcu_head structures to call_srcu(), though,
if that helps.  You of course must have some way of waiting for the
callback to be invoked before exiting that function.  This should be
easy for me to package into an API, maybe using one of the existing
reference-counting APIs.
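
To make the shape of that concrete, here is a rough userspace model, not
kernel code and not a proposed API.  It assumes the number of heads is
known at build time (NR_HEADS).  The names queue_callback(),
run_callbacks(), struct multi_wait and struct wait_entry are all made up
for illustration: queue_callback() stands in for call_srcu(), and
run_callbacks() stands in for the grace period ending.  Only the
next/func layout of the head mirrors the real struct rcu_head.  Because
each head is a list node, queueing the same head twice would corrupt the
list, which is why every call gets its own.

```c
#include <assert.h>
#include <stddef.h>

#define NR_HEADS 4	/* assumed to be known at build time */

/* Same two fields as the kernel's struct rcu_head; the rest is a model. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Hypothetical on-stack waiter: a shared counter plus a "done" flag. */
struct multi_wait {
	int pending;
	int done;
};

/* One entry per call: its own head plus a pointer back to the waiter. */
struct wait_entry {
	struct cb_head head;	/* must stay first, see wakeup_cb() */
	struct multi_wait *owner;
};

static struct cb_head *cb_queue;

/* Stand-in for call_srcu(): links the head into a pending list. */
static void queue_callback(struct cb_head *head,
			   void (*func)(struct cb_head *head))
{
	head->func = func;
	head->next = cb_queue;
	cb_queue = head;
}

/* Stand-in for a grace period ending: invoke every queued callback. */
static void run_callbacks(void)
{
	while (cb_queue) {
		struct cb_head *head = cb_queue;

		cb_queue = head->next;
		head->func(head);
	}
}

/* Each callback decrements the counter; the final one does the wakeup. */
static void wakeup_cb(struct cb_head *head)
{
	/* head is the first member, so this cast recovers the entry. */
	struct wait_entry *e = (struct wait_entry *)head;

	if (--e->owner->pending == 0)
		e->owner->done = 1;
}

/* Returns 1 if all NR_HEADS callbacks ran before the frame was left. */
int wait_on_all(void)
{
	struct multi_wait w = { .pending = NR_HEADS, .done = 0 };
	struct wait_entry entries[NR_HEADS];
	int i;

	for (i = 0; i < NR_HEADS; i++) {
		entries[i].owner = &w;
		queue_callback(&entries[i].head, wakeup_cb);
	}

	/* The function must not return while any head is still queued. */
	run_callbacks();
	assert(cb_queue == NULL);

	return w.done && w.pending == 0;
}
```

In the kernel the drain step would instead be a real wait on something
like a completion, and the counter would need to be atomic, since the
callbacks can run concurrently with each other and with the waiter.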

So, do you have a separate stack frame for each of the desired call_srcu()
invocations?  If not, do you know at build time how many rcu_head
structures you need?  If the answer to both of these is "no", then
it is likely that there needs to be an rcu_head in each of the relevant
data structures, as was noted earlier in this thread.

Yeah, I should go read the code.  But I would need to know where it is
and it is still early in the morning over here!  ;-)

I probably should have read the remainder of the thread before
replying, as well.  But what is the fun in that?

							Thanx, Paul



More information about the Linux-nvme mailing list