[RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields [code first?]

Salil Mehta salil.mehta at huawei.com
Thu Sep 14 19:29:13 PDT 2023


Hi Ard,

> From: Ard Biesheuvel <ardb at kernel.org>
> Sent: Thursday, September 14, 2023 4:34 PM
> To: Jonathan Cameron <jonathan.cameron at huawei.com>
> Cc: James Morse <james.morse at arm.com>; linux-pm at vger.kernel.org;
> loongarch at lists.linux.dev; linux-acpi at vger.kernel.org; linux-
> arch at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm-
> kernel at lists.infradead.org; linux-riscv at lists.infradead.org;
> kvmarm at lists.linux.dev; x86 at kernel.org; Salil Mehta
> <salil.mehta at huawei.com>; Russell King <linux at armlinux.org.uk>; Jean-
> Philippe Brucker <jean-philippe at linaro.org>; jianyong.wu at arm.com;
> justin.he at arm.com
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
> 
> On Thu, 14 Sept 2023 at 16:55, Jonathan Cameron
> <Jonathan.Cameron at huawei.com> wrote:
> >
> > On Thu, 14 Sep 2023 09:57:44 +0200
> > Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > > Hello James,
> > >
> > > On Wed, 13 Sept 2023 at 18:41, James Morse <james.morse at arm.com> wrote:
> > > >
> > > > Add the new flag field to the MADT's GICC structure.
> > > >
> > > > 'Online Capable' indicates a disabled CPU can be enabled later.
> > > >
> > >
> > > Why do we need a bit for this? What would be the point of describing
> > > disabled CPUs that cannot be enabled (and are you are aware of
> > > firmware doing this?).
> >
> > Enabled being not set is common at some similar ACPI tables at least.
> >
> > This is available in most ACPI tables to allow firmware to use 'nearly'
> > static tables and just tweak the 'enabled' bit to say if the record should
> > be ignored or not. Also _STA not present which is for same trick.
> > If you are doing clever dynamic tables, then you can just not present
> > the entry.
> >
> > With that existing use case in mind, need another bit to say this
> > one might one day turn up.  Note this is copied from x86 though no
> > one seems to have implemented the kernel support for them yet.
> >
> > Note as per my other reply - this isn't a code first proposal. It's in the
> > spec already (via a code first proposal last year I think).
> >
> > >
> > > So why are we not able to assume that this new bit can always be treated as '1'?
> >
> > Given above, need the extra bit to size stuff to allow for the CPU showing up
> > late.
> >
> 
> So does this mean that on x86, the CPU object is instantiated only
> when the hardware level hotplug occurs? And before that, the object
> does not exist at all?

That is correct but I am not sure if the presence of hardware Hotplug
on x86 is even true. It all hidden behind firmware magic (I think). So
x86 is able to use same infrastructure both for virtual and physical
CPU Hotplug.

From the ACPI 6.3 > x86 have started to use online-capable bit for local
x2apic in the MADT Table

https://lore.kernel.org/lkml/168016878002.404.5262105401164408214.tip-bot2@tip-bot2/
https://lore.kernel.org/lkml/168016878085.404.6003734700616193238.tip-bot2@tip-bot2/

But there is a subtle difference in the way it is being used on x86
and on the ARM platform right now.

On x86, during init, if the MADT entry for LAPIC is found to be
online-capable and is enabled as well then possible and present
cpumask gets set and a logical cpu-id is also allocated. If the
MADT entry is online-capable but not enabled then disabled cpus
are still counted but logical cpu-id is not allocated during
init time and in fact setting present mask bits are also
deferred till Hotplug happens later.

static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
{
      [...]  
	if (!enabled) {  /* Not ACPI_MADT_ENABLED */
		++disabled_cpus;
		return -EINVAL;
	}

      [...]  

	cpu = generic_processor_info(id, ver); /* logical cupid, present mask*/

      [...]
	return cpu;
}

acpi_parse_x2apic(union acpi_subtable_headers * header, const unsigned long end)
{
	struct acpi_madt_local_x2apic *processor = NULL;

	processor = (struct acpi_madt_local_x2apic *)header;

      [...] 

      enabled = processor->lapic_flags & ACPI_MADT_ENABLED;

      [...]

	/* don't register processors that cannot be onlined */
	if (!acpi_is_processor_usable(processor->lapic_flags))
		return 0;

      [...]

	acpi_register_lapic(apic_id, processor->uid, enabled);

	return 0;
}

On ARM, we similarly identify all MADT GICC entries which are
*usable* i.e. either are *ENABLED* or *online-capable*. But
Unlike x86, all cpus corresponding to usable MADT GICC entries
gets logical cpu-ds allocated and their present bit mask set
during boot itself. Hence, present mask is always equal to
the possible cpus mask on ARM.

https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags


For online-capable but *not* enabled CPUs we defer the
registration of the logical CPU-ids with the Linux Driver Model
till the time ACPI Hotplug event occurs. This means
register_cpu() is not called for the disabled CPUs during
init time. Hence, sysfs entries for the disabled CPUs
don’t exits.

But above creates bit of confusion to a x86 accustomed users
as on ARM with our solution, present CPUs are always equal to
possible CPUs. 

$ cat /sys/devices/system/cpu/possible
0-5
$ cat /sys/devices/system/cpu/present
0-5
$ cat /sys/devices/system/cpu/online
0-1
$ cat /sys/devices/system/cpu/offline
2-5

There is no way to know which CPUs have been hotplugged
using above interface. Hence, we have also a new mask
of enabled CPUs in the

$ cat /sys/devices/system/cpu/possible
0-5
$ cat /sys/devices/system/cpu/present
0-5
$ cat /sys/devices/system/cpu/enabled
0-2
$ cat /sys/devices/system/cpu/online
0-1
$ cat /sys/devices/system/cpu/offline
2-5

Qemu parameters: -smp cpu=3 maxcpus=6
Kernel parameter: maxcpus=2 


> 
> Because it seems to me that _STA, having both enabled and present
> bits, could already describe what we need here, and arguably, a CPU
> that is not both present and enabled should not be used by the OS.
> This would leave room for representing off-line CPUs as present but
> not enabled.

That is correct understanding.

For plugged cpus:
_STA.Present=1 and _STA.Enabled=1

For unplugged cpus:
_STA.Present=1 and _STA.Enabled=0

Hot(un)plugging is only allowed if during boot the GICC entries were
discovered as *online-capable*. GICC entries which are MADT GICC
enabled during boot cannot be hot-unplugged either.  

Catch:
If hot unplugging is to be supported for all cpus except the boot
then we MUST set all CPUs except boot CPUs as *online-capable*. 
This poses compatibility problems with the legacy OS running over
latest machines/platforms supporting Hotplug feature. OS might
ignore all the online-capable bits during boot time and hence only
1 CPU i.e. boot cpus might appear.

Hence, MADT.GICC.Enabled bits and MADT.GICC.online-capable need
Not be mutually exclusive. This requires more discussions!

You might find below useful:
https://kvm-forum.qemu.org/2023/talk/9SMPDQ/



> 
> Apologies if I am missing something obvious here - the whole rationale
> behind this thing is rather confusing to me.


More information about the linux-riscv mailing list