[PATCH 2/3] arm64: smp: Implement cpus_has_pending_ipi()

Ulf Hansson ulf.hansson at linaro.org
Fri Oct 10 01:30:11 PDT 2025


On Mon, 6 Oct 2025 at 17:55, Marc Zyngier <maz at kernel.org> wrote:
>
> On Fri, 03 Oct 2025 16:02:44 +0100,
> Ulf Hansson <ulf.hansson at linaro.org> wrote:
> >
> > To add support for keeping track of whether there may be a pending IPI
> > scheduled for a CPU or a group of CPUs, let's implement
> > cpus_has_pending_ipi() for arm64.
> >
> > Note that the implementation is intentionally lightweight and doesn't
> > use any additional locking. This is good enough for cpuidle-based
> > decisions.
> >
> > Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
> > ---
> >  arch/arm64/kernel/smp.c | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 68cea3a4a35c..dd1acfa91d44 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -55,6 +55,8 @@
> >
> >  #include <trace/events/ipi.h>
> >
> > +static DEFINE_PER_CPU(bool, pending_ipi);
> > +
> >  /*
> >   * as from 2.5, kernels no longer have an init_tasks structure
> >   * so we need some other way of telling a new secondary core
> > @@ -1012,6 +1014,8 @@ static void do_handle_IPI(int ipinr)
> >
> >       if ((unsigned)ipinr < NR_IPI)
> >               trace_ipi_exit(ipi_types[ipinr]);
> > +
> > +     per_cpu(pending_ipi, cpu) = false;
> >  }
> >
> >  static irqreturn_t ipi_handler(int irq, void *data)
> > @@ -1024,10 +1028,26 @@ static irqreturn_t ipi_handler(int irq, void *data)
> >
> >  static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
> >  {
> > +     unsigned int cpu;
> > +
> > +     for_each_cpu(cpu, target)
> > +             per_cpu(pending_ipi, cpu) = true;
> > +
>
> Why isn't all of this part of the core IRQ management? We already
> track things like timers, I assume for similar reasons. If IPIs have
> to be singled out, I'd rather this is done in common code, and not on
> a per architecture basis.

The idea was to start simple and to avoid running code on architectures
that don't seem to need it, hence the opt-in and lightweight approach.

I guess we could do this in the generic IRQ code too, perhaps making it
conditional on a Kconfig option, if required.
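
To make that concrete, here is a rough, untested sketch of what the
generic variant could look like in kernel/irq/ (the helper names and
the CONFIG_GENERIC_IPI_TRACKING symbol are made up for illustration,
not an actual proposal):

/* Hypothetical: compiled only when CONFIG_GENERIC_IPI_TRACKING is set. */
static DEFINE_PER_CPU(bool, pending_ipi);

void ipi_mark_pending(const struct cpumask *target)
{
	unsigned int cpu;

	for_each_cpu(cpu, target)
		per_cpu(pending_ipi, cpu) = true;
}

void ipi_clear_pending(unsigned int cpu)
{
	per_cpu(pending_ipi, cpu) = false;
}

bool cpus_has_pending_ipi(const struct cpumask *mask)
{
	unsigned int cpu;

	for_each_cpu(cpu, mask) {
		if (per_cpu(pending_ipi, cpu))
			return true;
	}
	return false;
}

Architectures that select the option would then call ipi_mark_pending()
from their cross-call path and ipi_clear_pending() at the end of their
IPI handler, instead of open-coding the flag as in this patch.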

>
> >       trace_ipi_raise(target, ipi_types[ipinr]);
> >       arm64_send_ipi(target, ipinr);
> >  }
> >
> > +bool cpus_has_pending_ipi(const struct cpumask *mask)
> > +{
> > +     unsigned int cpu;
> > +
> > +     for_each_cpu(cpu, mask) {
> > +             if (per_cpu(pending_ipi, cpu))
> > +                     return true;
> > +     }
> > +     return false;
> > +}
> > +
>
> The lack of memory barriers makes me wonder how reliable this is.
> Maybe this is relying on the IPIs themselves acting as such, but
> that's extremely racy no matter how you look at it.

It's deliberately lightweight. I am worried about introducing
locking/barriers, as those could be costly and add latency in these
paths.

Still, this is good enough to significantly improve cpuidle-based
decisions in this regard. Please have a look at the commit message of
patch 3.

That said, I am certainly open to suggestions on how to improve the
raciness, while still keeping the code lightweight.

Kind regards
Uffe


