Why GICD_ITARGETSR is not used by Linux

Tue Sep 20 08:37:15 PDT 2022

Hi Russell,

 ---- On Tue, 20 Sep 2022 12:09:38 +0200  Russell King (Oracle)  wrote --- 
 > On Tue, Sep 20, 2022 at 11:45:10AM +0200, Li Chen wrote:
 > > Hi Arnd,
 > > 
 > >  ---- On Tue, 20 Sep 2022 09:04:16 +0200  Arnd Bergmann  wrote --- 
 > >  > On Tue, Sep 20, 2022, at 3:42 AM, Li Chen wrote:
 > >  > > Hi Arnd,
 > >  > >
 > >  > > I noticed GIC has GICD_ITARGETSR to distribute IRQ to different CPUs, 
 > >  > > but currently, it is not used by Linux.
 > >  > >
 > >  > > There was a patchset from MTK people: 
 > >  > > http://archive.lwn.net:8080/linux-kernel/1606486531-25719-1-git-send-email-hanks.chen@mediatek.com/T/#t 
 > >  > > which implements GIC-level IRQ distributor using GICD_ITARGETSR, but it
 > >  > > is not accepted because the maintainer thinks it will break existing codes
 > >  > > and not provide benefits compared with the existing affinity mechanism.
 > >  > >
 > >  > > IIUC, Linux only relies on affinity/irqbalance to distribute IRQ 
 > >  > > instead of architecture-specific solutions like GIC's distributor.
 > >  > >
 > >  > > Maybe latency can somewhat get improved, but there is no benchmark yet.
 > >  > >
 > >  > > I have two questions here:
 > >  > > 1. Now that Linux doesn't use GICD_ITARGETSR, where does it set CPU 0 
 > >  > > to be the only IRQ distributor core?
 > >  > > 2. Do you know any other reasons that GICD_ITARGETSR is not used by 
 > >  > > Linux?
 > >  > 
 > >  > Hi Li,
 > >  > 
 > >  > It looks like the original submitter never followed up
 > >  > with a new version of the patch that addresses the
 > >  > issues found in review. I would assume they gave up either
 > >  > because it did not show any real-world advantage, or they
 > >  > could not address all of the concerns.
 > > 
 > > Thanks for your reply.
 > > 
 > > FYI, here is another thread about this topic: https://lore.kernel.org/linux-arm-kernel/20191120105017.GN25745@shell.armlinux.org.uk/
 > 
 > Oh god, not this again.
 > 
 > The behaviour of the GIC is as follows. If you set two CPUs in
 > GICD_ITARGETSRn, then the interrupt will be delivered to _both_ of
 > those CPUs. Not just one selected at random or determined by some
 > algorithm, but both CPUs.
 > 
 > Both CPUs get woken up if they're in sleep, and both CPUs attempt to
 > process the interrupt. One CPU will win the lock, while the other CPU
 > spins waiting for the lock to process the interrupt.
 > 
 > The winning CPU will process the interrupt, clear it on the device,
 > release the lock and acknowledge it at the GIC CPU interface.
 > 
 > The CPU that lost the previous race can now proceed to process the
 > very same interrupt, discovers that it's no longer pending on the
 > device, and signals IRQ_NONE as it appears to be a spurious interrupt.
 > 
 > The result is that the losing CPU ends up wasting CPU cycles, and
 > if the losing CPU was in a low power idle state, needlessly wakes up
 > to process this interrupt.
 > 
 > If you have more CPUs involved, you have more CPUs wasting CPU cycles,
 > being woken up wasting power - not just occasionally, but almost every
 > single interrupt that is raised from a device in the system.

Thank you very much for the explanation. It sounds like setting multiple
CPUs in GICD_ITARGETSRn is not a useful feature, and the register is no
longer available in the GIC-600 spec.
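
To make the cost concrete, here is a rough sketch in C. It assumes the GICv2
layout, where each interrupt ID owns one byte of GICD_ITARGETSRn starting at
distributor offset 0x800 and bit n of that byte selects CPU interface n. The
helper names and the cost model are hypothetical and only illustrate the worst
case described above, where every targeted CPU takes every interrupt and all
but one return IRQ_NONE.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 distributor offset of GICD_ITARGETSR0. */
#define GICD_ITARGETSR_BASE 0x800u

/*
 * Byte offset, within the distributor, of the target field for `irq`.
 * Each interrupt ID has one byte; bit n of that byte selects CPU interface n.
 */
static uint32_t itargetsr_byte_offset(uint32_t irq)
{
    assert(irq < 1020); /* IDs 1020-1023 are reserved */
    return GICD_ITARGETSR_BASE + irq;
}

/* Number of CPU interfaces selected in an 8-bit target field. */
static unsigned targets_count(uint8_t targets)
{
    unsigned n = 0;

    while (targets) {
        n += targets & 1u;
        targets >>= 1;
    }
    return n;
}

/*
 * Hypothetical worst-case model (an assumption, not a measurement):
 * every targeted CPU is interrupted, one handles the interrupt, and
 * each of the others wakes up only to report a spurious interrupt.
 */
static uint64_t wasted_wakeups(uint8_t targets, uint64_t nr_irqs)
{
    unsigned n = targets_count(targets);

    return n > 1 ? (uint64_t)(n - 1) * nr_irqs : 0;
}
```

With two CPUs targeted (0x03), this model gives one wasted wakeup per
interrupt; with a single CPU targeted (0x01) it gives none, which matches
writing only one target bit per interrupt.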

 > On architectures such as x86, the PICs distribute the interrupts in
 > hardware amongst the CPUs. So if a single interrupt is set to be sent
 > to multiple CPUs, only _one_ of the CPUs is actually interrupted.
 > Hence, x86 can have multiple CPUs selected as a destination, and
 > the hardware delivers the interrupt across all CPUs.

I have some experience with an Alpha-like architecture, which binds MSIs to
different cores. I think only one core gets interrupted there too, but I
don't know how they do it.

 > On ARM, we don't have that. We have a thundering herd of CPUs if we
 > set more than one CPU to process the interrupt, which is grossly
 > inefficient.

So, on Arm chips with PCIe controller(s), do we also rely on irqbalance to
distribute the endpoints' legacy IRQ/MSI/MSI-X interrupts?

 > As I said in the reply you linked to above, I did attempt to implement
 > several ideas in software, where the kernel would attempt to distribute
 > dynamically the interrupt amongst the CPUs in the affinity mask, but I
 > could never get what appeared to be a good behaviour on the platforms
 > I was trying and performance wasn't as good. So I abandoned it.
 > 
 > This doesn't preclude someone else having a go at solving that problem,
 > but the problem is not solved by setting multiple CPU bits in the
 > GICD_ITARGETSRn registers. As I said above, that just gets you a
 > thundering herd problem, less performance, and worse power consumption.
 > 
 > -- 
 > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
 > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
 >
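
For illustration only, the software approach mentioned above (spreading an
interrupt across the CPUs in its affinity mask while only ever targeting one
CPU at a time) could be sketched as a round-robin picker. This is not the code
Russell tried; the function names and the 8-CPU limit (taken from the 8-bit
GICv2 target field) are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the next CPU after `last`, wrapping around, whose bit is set
 * in `mask`. If `last` is the only CPU in the mask, it is returned again.
 */
static unsigned next_target_cpu(uint8_t mask, unsigned last)
{
    assert(mask != 0);
    assert(last < 8);

    for (unsigned step = 1; step <= 8; step++) {
        unsigned cpu = (last + step) % 8;

        if (mask & (1u << cpu))
            return cpu;
    }
    return last; /* not reached: mask is non-zero */
}

/* Target-field value with exactly one CPU selected, avoiding the herd. */
static uint8_t single_target_field(unsigned cpu)
{
    assert(cpu < 8);
    return (uint8_t)(1u << cpu);
}
```

With an affinity mask of 0x05 (CPU 0 and CPU 2), successive calls alternate
between CPU 2 and CPU 0, and the value written to the target field always has
a single bit set. Whether retargeting like this actually helps is exactly the
open question raised above; the sketch says nothing about performance.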