Why GICD_ITARGETSR is not used by Linux

Russell King (Oracle) linux at armlinux.org.uk
Tue Sep 20 03:09:38 PDT 2022


On Tue, Sep 20, 2022 at 11:45:10AM +0200, Li Chen wrote:
> Hi Arnd,
> 
>  ---- On Tue, 20 Sep 2022 09:04:16 +0200  Arnd Bergmann  wrote --- 
>  > On Tue, Sep 20, 2022, at 3:42 AM, Li Chen wrote:
>  > > Hi Arnd,
>  > >
>  > > I noticed GIC has GICD_ITARGETSR to distribute IRQ to different CPUs, 
>  > > but currently, it is not used by Linux.
>  > >
>  > > There was a patchset from MTK people: 
>  > > http://archive.lwn.net:8080/linux-kernel/1606486531-25719-1-git-send-email-hanks.chen@mediatek.com/T/#t 
>  > > which implements a GIC-level IRQ distributor using GICD_ITARGETSR, 
>  > > but it was not accepted because the maintainer thinks it will break 
>  > > existing code and not provide benefits compared with the existing 
>  > > affinity mechanism.
>  > >
>  > > IIUC, Linux only relies on affinity/irqbalance to distribute IRQ 
>  > > instead of architecture-specific solutions like GIC's distributor.
>  > >
>  > > Latency might improve somewhat, but there is no benchmark yet.
>  > >
>  > > I have two questions here:
>  > > 1. Now that Linux doesn't use GICD_ITARGETSR, where does it set CPU 0 
>  > > to be the only IRQ distributor core?
>  > > 2. Do you know any other reasons that GICD_ITARGETSR is not used by 
>  > > Linux?
>  > 
>  > Hi Li,
>  > 
>  > It looks like the original submitter never followed up
>  > with a new version of the patch that addresses the
>  > issues found in review. I would assume they gave up either
>  > because it did not show any real-world advantage, or they
>  > could not address all of the concerns.
> 
> Thanks for your reply.
> 
> FYI, here is another thread about this topic: https://lore.kernel.org/linux-arm-kernel/20191120105017.GN25745@shell.armlinux.org.uk/

Oh god, not this again.

The behaviour of the GIC is as follows. If you set two CPUs in
GICD_ITARGETSRn, then the interrupt will be delivered to _both_ of
those CPUs. Not just one selected at random or determined by some
algorithm, but both CPUs.
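
To make the register layout concrete, here is a minimal C sketch of how
the per-interrupt CPU target field is addressed. It assumes the GICv2
layout (GICD_ITARGETSRn at distributor offset 0x800 + 4*n, with four
8-bit target fields per register). The helper names are hypothetical
and are not taken from the Linux GIC driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GICv2 layout: GICD_ITARGETSRn sits at distributor offset
 * 0x800 + 4*n, and each register holds four 8-bit "CPU targets"
 * fields, one per interrupt ID. Bit n of a field selects CPU n. */
#define GICD_ITARGETSR_BASE 0x800u

/* Byte offset (from the distributor base) of the register covering irq. */
static inline uint32_t itargetsr_offset(unsigned int irq)
{
    assert(irq < 1020);            /* IDs 1020-1023 are reserved */
    return GICD_ITARGETSR_BASE + (irq / 4u) * 4u;
}

/* Bit position of irq's 8-bit target field within that register. */
static inline unsigned int itargetsr_shift(unsigned int irq)
{
    return (irq % 4u) * 8u;
}

/* Return reg with irq's target field replaced by cpu_mask. */
static inline uint32_t itargetsr_set(uint32_t reg, unsigned int irq,
                                     uint8_t cpu_mask)
{
    unsigned int shift = itargetsr_shift(irq);

    reg &= ~(0xffu << shift);                 /* clear the old field */
    return reg | ((uint32_t)cpu_mask << shift);
}
```

With these helpers, itargetsr_set(reg, 34, 0x03) targets interrupt 34
at CPU 0 and CPU 1, which is the configuration that gets the interrupt
delivered to both CPUs. A mask such as 0x01 targets CPU 0 only.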

Both CPUs get woken up if they are in a sleep state, and both CPUs
attempt to process the interrupt. One CPU will win the lock, while the
other CPU spins waiting for the lock before it can process the interrupt.

The winning CPU will process the interrupt, clear it on the device,
release the lock and acknowledge it at the GIC CPU interface.

The CPU that lost the race can now proceed to process the very same
interrupt. It discovers that the interrupt is no longer pending on the
device, and signals IRQ_NONE as it appears to be a spurious interrupt.

The result is that the losing CPU ends up wasting CPU cycles, and
if the losing CPU was in a low power idle state, needlessly wakes up
to process this interrupt.

If you have more CPUs involved, you have more CPUs wasting CPU cycles
and being woken up, wasting power - not just occasionally, but on almost
every single interrupt that is raised from a device in the system.
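
As a rough illustration of how that cost scales, the following C sketch
models the worst case described above. It assumes every targeted CPU
takes every interrupt and exactly one of them does the useful work. The
struct and function names are hypothetical, and the numbers it produces
are a model, not measurements.

```c
#include <assert.h>

/* Hypothetical bookkeeping for the worst-case model. */
struct herd_cost {
    unsigned long handled;   /* interrupts actually serviced */
    unsigned long wasted;    /* handler entries that found nothing to do */
};

/* Assumption: each interrupt reaches all targeted CPUs, one CPU wins
 * the lock and services it, and every other CPU ends up with IRQ_NONE. */
static struct herd_cost herd_model(unsigned int targeted_cpus,
                                   unsigned long interrupts)
{
    struct herd_cost c;

    assert(targeted_cpus >= 1);
    c.handled = interrupts;
    c.wasted = (unsigned long)(targeted_cpus - 1) * interrupts;
    return c;
}
```

Under this model, herd_model(4, 1000) gives 1000 serviced interrupts
and 3000 wasted handler entries, while a single targeted CPU wastes
none.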

On architectures such as x86, the PICs distribute the interrupts in
hardware amongst the CPUs. So if a single interrupt is set to be sent
to multiple CPUs, only _one_ of the CPUs is actually interrupted each
time it fires. Hence, x86 can have multiple CPUs selected as a
destination, and the hardware spreads the interrupts across those CPUs.

On ARM, we don't have that. We have a thundering herd of CPUs if we
set more than one CPU to process the interrupt, which is grossly
inefficient.

As I said in the reply you linked to above, I did attempt to implement
several ideas in software, where the kernel would attempt to dynamically
distribute the interrupt amongst the CPUs in the affinity mask, but I
could never get what appeared to be good behaviour on the platforms
I was trying, and performance wasn't as good. So I abandoned it.

This doesn't preclude someone else having a go at solving that problem,
but the problem is not solved by setting multiple CPU bits in the
GICD_ITARGETSRn registers. As I said above, that just gets you a
thundering herd problem, less performance, and worse power consumption.
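
One way to accept a multi-CPU affinity mask while still programming a
single target bit is to reduce the mask to one CPU in software before
it reaches the register. The C sketch below is only an illustration of
that idea. The function name is hypothetical, and picking the
lowest-numbered online CPU is an assumed policy, not necessarily what
any kernel driver does.

```c
#include <stdint.h>

/* Reduce an affinity mask to a single-CPU target mask (bit n = CPU n).
 * Assumed policy: take the lowest-numbered CPU that is both requested
 * and online. Returns 0 if no requested CPU is online. */
static inline uint8_t pick_single_target(uint8_t affinity, uint8_t online)
{
    uint32_t m = (uint32_t)affinity & online;

    /* m & (two's complement of m) isolates the lowest set bit. */
    return (uint8_t)(m & (~m + 1u));
}
```

For example, an affinity mask of 0x0c (CPU 2 and CPU 3) with all CPUs
online reduces to 0x04, so only CPU 2 is interrupted.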

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


