[PATCH RFC v3 17/21] ACPI: add support to register CPUs based on the _STA enabled bit
Russell King (Oracle)
linux at armlinux.org.uk
Tue Jan 23 05:10:44 PST 2024
On Tue, Jan 23, 2024 at 10:26:03AM +0000, Jonathan Cameron wrote:
> On Tue, 2 Jan 2024 14:53:20 +0000
> Jonathan Cameron <Jonathan.Cameron at Huawei.com> wrote:
>
> > On Mon, 18 Dec 2023 13:03:32 +0000
> > "Russell King (Oracle)" <linux at armlinux.org.uk> wrote:
> >
> > > On Wed, Dec 13, 2023 at 12:50:38PM +0000, Russell King wrote:
> > > > From: James Morse <james.morse at arm.com>
> > > >
> > > > acpi_processor_get_info() registers all present CPUs. Registering a
> > > > CPU is what creates the sysfs entries and triggers the udev
> > > > notifications.
> > > >
> > > > arm64 virtual machines that support 'virtual cpu hotplug' use the
> > > > enabled bit to indicate whether the CPU can be brought online, as
> > > > the existing ACPI tables require all hardware to be described and
> > > > present.
> > > >
> > > > If firmware describes a CPU as present, but disabled, skip the
> > > > registration. Such CPUs are present, but can't be brought online for
> > > > whatever reason. (e.g. firmware/hypervisor policy).
> > > >
> > > > Once firmware sets the enabled bit, the CPU can be registered and
> > > > brought online by user-space. Online CPUs, or CPUs that are missing
> > > > an _STA method must always be registered.
> > >
> > > ...
> > >
> > > > @@ -526,6 +552,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> > > > acpi_processor_make_not_present(device);
> > > > return;
> > > > }
> > > > +
> > > > + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
> > > > + arch_unregister_cpu(pr->id);
> > >
> > > This change isn't described in the commit log, but seems to be the cause
> > > of the build error identified by the kernel build bot that is fixed
> > > later in this series. I'm wondering whether this should be in a
> > > different patch, maybe "ACPI: Check _STA present bit before making CPUs
> > > not present" ?
> >
> > Would seem a bit odd to call arch_unregister_cpu() way before the code
> > is added to call the matching arch_registers_cpu()
> >
> > Mind you this eject doesn't just apply to those CPUs that are registered
> > later I think, but instead to all. So we run into the spec hole that
> > there is no way to identify initially 'enabled' CPUs that might be disabled
> > later.
> >
> > >
> > > Or maybe my brain isn't working properly (due to being Covid positive.)
> > > Any thoughts, Jonathan?
> >
> > I'll go with a resounding 'not sure' on where this change belongs.
> > I blame my non existent start of the year hangover.
> > Hope you have recovered!
>
> Looking again, I think you were right, move it to that earlier patch.
I'm having second thoughts - because this patch introduces the
arch_register_cpu() into the acpi_processor_add() path (via
acpi_processor_get_info() and acpi_processor_make_enabled(), so isn't
it also correct to add arch_unregister_cpu() to the detach/post_eject
path as well? If we add one without the other, doesn't stuff become
a bit asymetric?
Looking more deeply at these changes, I'm finding it isn't easy to
keep track of everything that's going on here.
We had attach()/detach() callbacks that were nice and symetrical.
How we have this post_eject() callback that makes things asymetrical.
We have the attach() method that registers the CPU, but no detach
method, instead having the post_eject() method. On the face of it,
arch_unregister_cpu() doesn't look symetric unless one goes digging
more in the code - by that, I mean arch_register_cpu() only gets
called of present=1 _and_ enabled=1. However, arch_unregister_cpu()
gets called buried in acpi_processor_make_not_present(), called when
present=0, and then we have this new one to handle the case where
enabled=0. It is not obvious that arch_unregister_cpu() is the reverse
of what happens with arch_register_cpu() here.
Then we have the add() method allocating pr->throttling.shared_cpu_map,
and acpi_processor_make_not_present() freeing it. From what I read in
ACPI v6.5, enabled is not allowed to be set without present. So, if
_STA reports that a CPU that had present=1 enabled=1, but then is
later reported to be enabled=0 (which we handle by calling only
arch_unregister_cpu()) then what happens when _STA changes to
enabled=1 later? Does add() get called? If it does, this would cause
a new acpi_processor structure to be allocated and the old one to be
leaked... I hope I'm wrong about add() being called - but if it isn't,
how does enabled going from 0->1 get handled... and if we are handling
its 1->0 transition separately from present, then surely we should be
handling that.
Maybe I'm just getting confused, but I've spent much of this morning
trying to unravel all this... and I'm of the opinion that this isn't a
sign of a good approach.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
More information about the linux-riscv
mailing list