[PATCH 1/4] arm64: alternative: wait for other CPUs before patching
Mark Rutland
mark.rutland at arm.com
Mon Dec 13 05:49:07 PST 2021
On Mon, Dec 13, 2021 at 01:31:52PM +0000, Will Deacon wrote:
> On Fri, Dec 03, 2021 at 10:47:20AM +0000, Mark Rutland wrote:
> > In __apply_alternatives_multi_stop() we have a "really simple polling
> > protocol" to avoid patching code that is concurrently executed on other
> > CPUs. Secondary CPUs wait for the boot CPU to signal that patching is
> > complete, but the boot CPU doesn't wait for secondaries to enter the
> > polling loop, and it's possible that patching starts while secondaries
> > are still within the stop_machine logic.
> >
> > Let's fix this by adding a vaguely simple polling protocol where the
> > boot CPU waits for secondaries to signal that they have entered the
> > unpatchable stop function. We can use the arch_atomic_*() functions for
> > this, as they are not patched with alternatives.
> >
> > At the same time, let's make `all_alternatives_applied` local to
> > __apply_alternatives_multi_stop(), since it is only used there, and this
> > makes the code a little clearer.
> >
> > Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> > Cc: Andre Przywara <andre.przywara at arm.com>
> > Cc: Ard Biesheuvel <ardb at kernel.org>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > Cc: James Morse <james.morse at arm.com>
> > Cc: Joey Gouly <joey.gouly at arm.com>
> > Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
> > Cc: Will Deacon <will at kernel.org>
> > ---
> > arch/arm64/kernel/alternative.c | 17 ++++++++++++-----
> > 1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> > index 3fb79b76e9d9..4f32d4425aac 100644
> > --- a/arch/arm64/kernel/alternative.c
> > +++ b/arch/arm64/kernel/alternative.c
> > @@ -21,9 +21,6 @@
> > #define ALT_ORIG_PTR(a) __ALT_PTR(a, orig_offset)
> > #define ALT_REPL_PTR(a) __ALT_PTR(a, alt_offset)
> >
> > -/* Volatile, as we may be patching the guts of READ_ONCE() */
> > -static volatile int all_alternatives_applied;
> > -
> > static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
> >
> > struct alt_region {
> > @@ -193,11 +190,17 @@ static void __nocfi __apply_alternatives(struct alt_region *region, bool is_modu
> > }
> >
> > /*
> > - * We might be patching the stop_machine state machine, so implement a
> > - * really simple polling protocol here.
> > + * Apply alternatives, ensuring that no CPUs are concurrently executing code
> > + * being patched.
> > + *
> > + * We might be patching the stop_machine state machine or READ_ONCE(), so
> > + * we implement a simple polling protocol.
> > */
> > static int __apply_alternatives_multi_stop(void *unused)
> > {
> > + /* Volatile, as we may be patching the guts of READ_ONCE() */
> > + static volatile int all_alternatives_applied;
> > + static atomic_t stopped_cpus = ATOMIC_INIT(0);
> > struct alt_region region = {
> > .begin = (struct alt_instr *)__alt_instructions,
> > .end = (struct alt_instr *)__alt_instructions_end,
> > @@ -205,12 +208,16 @@ static int __apply_alternatives_multi_stop(void *unused)
> >
> > /* We always have a CPU 0 at this point (__init) */
> > if (smp_processor_id()) {
> > + arch_atomic_inc(&stopped_cpus);
>
> Why can't we use normal atomic_inc() here?
Because there may be explicit instrumentation enabled in the atomic_inc()
wrapper, and that instrumentation code may itself call into patchable code.
Today we'd get away with using atomic_inc(), since currently all the
instrumentation happens to be prior to the actual AMO, but generally to avoid
instrumentation we're supposed to use the arch_atomic_*() ops.
There are some other latent issues with calling into instrumentable code here,
which I plan to address in future patches, so if you want I can make this a
regular atomic_inc() for now and tackle that as a separate problem. Otherwise,
I can elaborate on the mention in the commit message to make that clearer.
> > while (!all_alternatives_applied)
> > cpu_relax();
> > isb();
> > } else {
> > DECLARE_BITMAP(remaining_capabilities, ARM64_NPATCHABLE);
> >
> > + while (arch_atomic_read(&stopped_cpus) != num_online_cpus() - 1)
>
> and normal atomic_read() here?
Same story as above.
Thanks,
Mark.