[PATCH v2 3/3] lib/group_cpus.c: honor housekeeping config when grouping CPUs

Ming Lei ming.lei at redhat.com
Mon Jul 1 02:16:11 PDT 2024


On Mon, Jul 01, 2024 at 10:43:14AM +0200, Hannes Reinecke wrote:
> On 7/1/24 09:21, Ming Lei wrote:
> > On Mon, Jul 01, 2024 at 09:08:32AM +0200, Daniel Wagner wrote:
> > > On Sun, Jun 30, 2024 at 09:39:59PM GMT, Ming Lei wrote:
> > > > > Make group_cpus_evenly aware of the isolcpus configuration and use the
> > > > > housekeeping CPU mask as the base for distributing the available CPUs
> > > > > into groups.
> > > > > 
> > > > > Fixes: 11ea68f553e2 ("genirq, sched/isolation: Isolate from handling managed interrupts")
> > > > 
> > > > Isolated CPUs are already handled when figuring out the irq effective
> > > > mask, so I am not sure how commit 11ea68f553e2 is wrong, or what this
> > > > patch fixes from the user's viewpoint?
> > > 
> > > IO queues are allocated/spread on the isolated CPUs, and if a thread
> > > submits IOs from an isolated CPU it will cause noise on the isolated
> > > CPUs. The question is: is this a use case you need/want to support?
> > 
> > I talked with the RH OpenShift team a few weeks ago and they have such a
> > use case.
> > 
> > Userspace is free to run any application on isolated CPUs via 'taskset
> > -c' even though 'isolcpus=' is passed on the kernel command line.
> > 
> > The kernel cannot add such a new constraint on userspace.
> > 
> > > We have customers who are complaining that even with isolcpus provided
> > > they still see IO noise on the isolated CPUs.
> > 
> > That is another issue, which has been fixed by the following patch:
> > 
> > a46c27026da1 blk-mq: don't schedule block kworker on isolated CPUs
> > 
> Hmm. Just when I thought I understood the issue ...
> 
> How is this supposed to work, then, given that I/O can be initiated
> from the isolated CPUs?
> I would have accepted a setup with two scheduling domains, where blk-mq
> is spread across all CPUs and the blk-mq cpusets are arranged according
> to the isolcpus settings.
> Then we could initiate I/O from the isolated CPUs, and the scheduler
> would 'magically' ensure that everything only runs on isolated CPUs.

blk-mq issues IO either from the current (submitting) context or from the
kblockd workqueue context.
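
For example, a minimal sketch of those two paths (not the actual blk-mq
code; sketch_run_hw_queue() and sketch_dispatch() are made-up names, and
cpumask_first() only stands in for the internal blk_mq_hctx_next_cpu()
selection):

#include <linux/blk-mq.h>
#include <linux/blkdev.h>
#include <linux/cpumask.h>

/* stand-in for the real dispatch of queued requests to the driver */
static void sketch_dispatch(struct blk_mq_hw_ctx *hctx)
{
}

static void sketch_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
{
	if (!async) {
		/* issue directly from the current (submitting) context */
		sketch_dispatch(hctx);
		return;
	}

	/* defer to a kblockd worker via hctx->run_work */
	kblockd_mod_delayed_work_on(cpumask_first(hctx->cpumask),
				    &hctx->run_work, 0);
}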

> 
> But that patch would completely counteract such a setup: during I/O we
> will, more often than not, invoke kblockd, which would then cause
> cross-talk on non-isolated CPUs.

If IO is submitted from an isolated CPU, blk-mq issues that IO via the
unbound kblockd workqueue, which is guaranteed not to run on isolated CPUs.
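
The idea is roughly the following (a minimal sketch of my reading of
a46c27026da1, not the patch itself; sketch_next_cpu() is a made-up stand-in
for the internal blk_mq_hctx_next_cpu(), and it assumes cpu_is_isolated()
from <linux/sched/isolation.h>):

#include <linux/blk-mq.h>
#include <linux/cpumask.h>
#include <linux/sched/isolation.h>

/*
 * Pick the CPU that will run hctx->run_work: skip isolated CPUs so the
 * kblockd worker only lands on housekeeping CPUs.
 */
static int sketch_next_cpu(struct blk_mq_hw_ctx *hctx)
{
	int cpu;

	for_each_cpu_and(cpu, hctx->cpumask, cpu_online_mask) {
		if (!cpu_is_isolated(cpu))
			return cpu;
	}

	/* every mapped CPU is isolated (or offline): fall back to any one */
	return cpumask_any_and(hctx->cpumask, cpu_online_mask);
}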


Thanks,
Ming



