[PATCH] irqchip/sifive-plic: Fix plic_set_affinity() only enables 1 cpu

Nam Cao namcao at linutronix.de
Sun Jul 7 07:34:14 PDT 2024


On Wed, Jul 03, 2024 at 08:31:31PM +0530, Anup Patel wrote:
> On Wed, Jul 3, 2024 at 6:03 PM Nam Cao <namcao at linutronix.de> wrote:
> >
> > On Wed, Jul 03, 2024 at 05:28:23PM +0530, Anup Patel wrote:
> > > On Wed, Jul 3, 2024 at 12:57 PM Nam Cao <namcao at linutronix.de> wrote:
> > > >
> > > > plic_set_affinity() only enables the interrupt for the first possible
> > > > CPU in the mask. The point is to prevent all CPUs from trying to claim
> > > > an interrupt when only one CPU can succeed and the others would waste
> > > > clock cycles for nothing.
> > > >
> > > > However, there are two problems with that:
> > > > 1. Users cannot enable an interrupt on multiple CPUs (for example, to
> > > > minimize interrupt latency).
> > >
> > > Well, you are assuming that multiple CPUs are always idle or available
> > > to process interrupts. In other words, if the system is loaded, running
> > > some workload on each CPU, then performance on multiple CPUs will
> > > degrade since multiple CPUs will wastefully try to claim the interrupt.
> > >
> > > In reality, we can't make such assumptions, and it is better to target
> > > a particular CPU for processing interrupts (just like various other
> > > interrupt controllers do). For balancing interrupt processing load, we
> > > have software irq balancers running in user space (or kernel space)
> > > which do a reasonably fine job of picking an appropriate CPU for
> > > interrupt processing.
> >
> > Then we should leave the job of distributing interrupts to those tools,
> > right? Not all use cases want minimally wasted CPU cycles. For example, a
> > particular interrupt may not arrive very often, but when it does, it
> > needs to be handled fast; in that case, enabling the interrupt on all
> > CPUs is clearly superior.
> 
> This is a very specific case which you are trying to optimize, and in the
> process hurting performance in many other cases. There is a lot of
> high-speed IO (network, storage, etc.) where the interrupt rate is high,
> so for such IO your patch will degrade performance on multiple CPUs.

No, it wouldn't be "hurting performance in many other cases". It would give
users the ability to do what they want, whether that is hurting performance
as you say, or improving performance as I pointed out earlier.

I am willing to bet that most users won't ever touch this. But if they do,
they had better know what they are doing. If they want to waste their CPU
cycles, so be it.

My point, essentially, is that the kernel shouldn't force any policy on
users. Forcing a policy only makes sense when that policy is _strictly_
better than any alternative, which is not the case here. What the driver
should do is provide a "good enough for most" default, but still let users
decide what's best for them.

Side note: if I am not mistaken, the effective affinity mask exists to
accommodate a hardware limitation of chips that cannot enable an interrupt
for all CPUs in the mask. The RISC-V PLIC, on the other hand, can enable an
interrupt for any set of CPUs, and therefore should do so.

Best regards,
Nam



More information about the linux-riscv mailing list