[PATCH v8 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs

Jonathan Cameron Jonathan.Cameron at Huawei.com
Mon Apr 29 02:21:31 PDT 2024


On Sun, 28 Apr 2024 12:28:03 +0100
Marc Zyngier <maz at kernel.org> wrote:

> On Fri, 26 Apr 2024 19:28:58 +0100,
> Jonathan Cameron <Jonathan.Cameron at Huawei.com> wrote:
> > 
> > 
> > I'll not send a formal v9 until early next week, so here is the current state
> > if you have time to take another look before then.  
> 
> Don't bother resending this on my account -- you only sent it on
> Friday and there hasn't been much response to it yet. There is still a
> problem (see below), but looks otherwise OK.
> 
> [...]
> 
> > @@ -2363,11 +2381,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> >  				(struct acpi_madt_generic_interrupt *)header;
> >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > +	int cpu = get_cpu_for_acpi_id(gicc->uid);  
> 
> I already commented that get_cpu_for_acpi_id() can...

Indeed sorry - I blame Friday syndrome for me failing to address that.

> 
> >  	void __iomem *redist_base;
> >  
> > -	if (!acpi_gicc_is_usable(gicc))
> > +	/* Neither enabled or online capable means it doesn't exist, skip it */
> > +	if (!(gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_ONLINE_CAPABLE)))
> >  		return 0;
> >  
> > +	/*
> > +	 * Capable but disabled CPUs can be brought online later. What about
> > +	 * the redistributor? ACPI doesn't want to say!
> > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > +	 * Otherwise, prevent such CPUs from being brought online.
> > +	 */
> > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > +		pr_warn("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > +		cpumask_set_cpu(cpu, &broken_rdists);  
> 
> ... return -EINVAL, and then be passed to cpumask_set_cpu(), with
> interesting effects. It shouldn't happen, but I trust anything that
> comes from firmware tables as much as I trust a campaigning
> politician's promises. This should really result in the RD being
> considered unusable, but without affecting any CPU (there is no valid
> CPU the first place).
> 
> Another question is what get_cpu_for acpi_id() returns for a disabled
> CPU. A valid CPU number? Or -EINVAL?
It's a match function that works by iterating cpu over 0 to nr_cpu_ids - 1
and checking

if (uid == get_acpi_id_for_cpu(cpu))

So the question becomes whether get_acpi_id_for_cpu() returns a valid ACPI ID
for a disabled CPU.

That uses acpi_cpu_get_madt_gicc(cpu)->uid so this all gets a bit circular.
That looks it up via cpu_madt_gicc[cpu] which after the proposed updated
patch is set if enabled or online capable.  There are however a few other
error checks in acpi_map_gic_cpu_interface() that could lead to it
not being set (MPIDR validity checks). I suspect all of these end up being
fatal elsewhere which is why this hasn't blown up before.

If any of those cases are possible we could get a null pointer
dereference.

Easy to harden this case via the following (which will leave us with
-EINVAL).  There are other call sites that might trip over this.
I'm inclined to harden them as a separate issue though, so as not
to get in the way of this patch set.


diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index bc9a6656fc0c..a407f9cd549e 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -124,7 +124,8 @@ static inline int get_cpu_for_acpi_id(u32 uid)
        int cpu;

        for (cpu = 0; cpu < nr_cpu_ids; cpu++)
-               if (uid == get_acpi_id_for_cpu(cpu))
+               if (acpi_cpu_get_madt_gicc(cpu) &&
+                   uid == get_acpi_id_for_cpu(cpu))
                        return cpu;

        return -EINVAL;

I'll spin an additional patch to make that change once I've tested that I
haven't messed it up.
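
For illustration only, here is a minimal userspace sketch of the hardened
lookup. It is not the kernel code: every mock_* name, the NR_CPU_IDS value and
the cut-down struct are assumptions that stand in for cpu_madt_gicc[],
acpi_cpu_get_madt_gicc(), get_acpi_id_for_cpu() and get_cpu_for_acpi_id().
It shows only that the NULL check keeps the loop from dereferencing an unset
slot, and that an unmatched UID falls through to -EINVAL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPU_IDS 4	/* hypothetical small system, not a kernel value */

/* Stand-in for struct acpi_madt_generic_interrupt; only the field used here. */
struct mock_gicc {
	uint32_t uid;
};

/* Stand-in for cpu_madt_gicc[]: NULL means no GICC entry was recorded. */
static struct mock_gicc *mock_cpu_madt_gicc[NR_CPU_IDS];

static struct mock_gicc *mock_cpu_get_madt_gicc(int cpu)
{
	return mock_cpu_madt_gicc[cpu];
}

/* Modelled on get_acpi_id_for_cpu(): dereferences unconditionally. */
static uint32_t mock_get_acpi_id_for_cpu(int cpu)
{
	return mock_cpu_get_madt_gicc(cpu)->uid;
}

/* Hardened lookup: skip slots with no GICC entry before comparing UIDs. */
static int mock_get_cpu_for_acpi_id(uint32_t uid)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (mock_cpu_get_madt_gicc(cpu) &&
		    uid == mock_get_acpi_id_for_cpu(cpu))
			return cpu;

	return -EINVAL;
}
```

With slots 1 and 3 left NULL, a lookup for a UID that is not present walks
past them safely and the caller sees -EINVAL instead of a crash.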

At the call site in gic_acpi_parse_madt_gicc() I'm not sure we can do better
than just skipping setting broken_rdists. I'll also pull the declaration of
that cpu variable down into this condition so it's more obvious we only
care about it in this error path.
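
Roughly what I have in mind for the call site, as a userspace sketch rather
than the real patch. The MOCK_* flag values, the bitmap standing in for the
broken_rdists cpumask, and the helper name are all assumptions made for
illustration. It also assumes the disabled-but-online-capable case returns
early after the warning, which the quoted hunk does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock flag bits; stand-ins for ACPI_MADT_ENABLED and
 * ACPI_MADT_GICC_ONLINE_CAPABLE, values chosen for illustration only. */
#define MOCK_MADT_ENABLED		(1u << 0)
#define MOCK_MADT_ONLINE_CAPABLE	(1u << 3)

/* Stand-in for the broken_rdists cpumask: one bit per CPU. */
static unsigned long mock_broken_rdists;

/* Returns true if the redistributor should be treated as usable. */
static bool mock_gicc_rdist_usable(uint32_t flags, int cpu)
{
	/* Neither enabled nor online capable: the CPU does not exist. */
	if (!(flags & (MOCK_MADT_ENABLED | MOCK_MADT_ONLINE_CAPABLE)))
		return false;

	if (!(flags & MOCK_MADT_ENABLED)) {
		/*
		 * Only mark a CPU as broken when the lookup gave a valid
		 * number; a negative errno must never reach the mask.
		 */
		if (cpu >= 0)
			mock_broken_rdists |= 1UL << cpu;
		return false;
	}

	return true;
}
```

The point is just that a failed lookup leaves the mask untouched while the
redistributor is still reported as unusable.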

Jonathan





> 
> Thanks,
> 
> 	M.
> 



