[PATCH 0/2] blk-mq: fix blk_mq_alloc_request_hctx

Ming Lei ming.lei at redhat.com
Wed Jun 30 16:59:26 PDT 2021


On Wed, Jun 30, 2021 at 09:46:35PM +0200, Hannes Reinecke wrote:
> On 6/30/21 8:59 PM, Sagi Grimberg wrote:
> > 
> > > > > > Shouldn't we rather modify the tagset to refer to the current
> > > > > > online CPUs _only_, thereby never submitting a connect request
> > > > > > for a hctx with only offline CPUs?
> > > > > 
> > > > > Then you may end up with very few io queues, and performance may
> > > > > suffer even though lots of CPUs become online later.
> > > > > 
> > > > Only if we stay with the reduced number of I/O queues, which is not
> > > > what I'm proposing; I'd rather connect and disconnect queues from
> > > > the cpu hotplug handler. For starters we could even trigger a reset
> > > > once the first cpu within a hctx is onlined.
> > > 
> > > Yeah, that would need one big/complicated patchset, but I don't see
> > > any advantages over this simple approach.
> > 
> > I tend to agree with Ming here.
> 
> Actually, Daniel and I came to a slightly different idea: use a cpu
> hotplug notifier.
> Thing is, blk-mq already has a cpu hotplug notifier, which should ensure
> that no I/O is pending during cpu hotplug.

Why should we ensure that for non-managed irqs?
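
For reference, the existing blk-mq notifier only drains a hctx when the last
online CPU mapped to it goes away, and that draining matters because a managed
irq is shut down together with its CPUs; a non-managed irq can simply be
migrated to another online CPU. A minimal sketch of that check (illustrative
naming, not the exact mainline code):

        /*
         * Sketch only: quiesce a hctx when the CPU going offline is the
         * last online CPU mapped to it. With non-managed irqs the
         * interrupt is migrated instead, so this draining is not needed.
         */
        static int example_hctx_notify_offline(unsigned int cpu,
                                               struct hlist_node *node)
        {
                struct blk_mq_hw_ctx *hctx = hlist_entry_safe(node,
                                struct blk_mq_hw_ctx, cpuhp_online);
                unsigned int other;

                if (!cpumask_test_cpu(cpu, hctx->cpumask))
                        return 0;

                /* Another CPU of this hctx still online? Nothing to do. */
                for_each_cpu_and(other, hctx->cpumask, cpu_online_mask)
                        if (other != cpu)
                                return 0;

                /* Last online CPU: stop new allocations, then drain. */
                set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
                return 0;
        }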

> If we now add an nvme cpu hotplug notifier which essentially kicks off a
> reset once all cpus in a hctx are offline, the reset logic will rearrange
> the queues to match the current cpu layout.
> And when the cpus are getting onlined we'll do another reset.
> 
> Daniel is currently preparing a patch; let's see how it goes.

What is the advantage of that big change over this simple approach?
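
For context, the failure this series targets sits in
blk_mq_alloc_request_hctx(), which nvme fc/rdma/tcp/loop use to submit
connect commands on a specific hctx. Roughly (abridged from the mainline
code of that time, error paths trimmed):

        data.hctx = q->queue_hw_ctx[hctx_idx];
        if (!blk_mq_hw_queue_mapped(data.hctx))
                goto out_queue_exit;

        /*
         * If every CPU in hctx->cpumask is offline, cpumask_first_and()
         * returns nr_cpu_ids and the ctx lookup below goes out of bounds;
         * that is the case the two patches in this series deal with.
         */
        cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
        data.ctx = __blk_mq_get_ctx(q, cpu);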

Thanks, 
Ming
