[PATCH 1/4] arm64: alternative: wait for other CPUs before patching

Mark Rutland mark.rutland at arm.com
Mon Dec 13 05:49:07 PST 2021


On Mon, Dec 13, 2021 at 01:31:52PM +0000, Will Deacon wrote:
> On Fri, Dec 03, 2021 at 10:47:20AM +0000, Mark Rutland wrote:
> > In __apply_alternatives_multi_stop() we have a "really simple polling
> > protocol" to avoid patching code that is concurrently executed on other
> > CPUs. Secondary CPUs wait for the boot CPU to signal that patching is
> > complete, but the boot CPU doesn't wait for secondaries to enter the
> > polling loop, and it's possible that patching starts while secondaries
> > are still within the stop_machine logic.
> > 
> > Let's fix this by adding a vaguely simple polling protocol where the
> > boot CPU waits for secondaries to signal that they have entered the
> > unpatchable stop function. We can use the arch_atomic_*() functions for
> > this, as they are not patched with alternatives.
> > 
> > At the same time, let's make `all_alternatives_applied` local to
> > __apply_alternatives_multi_stop(), since it is only used there, and this
> > makes the code a little clearer.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> > Cc: Andre Przywara <andre.przywara at arm.com>
> > Cc: Ard Biesheuvel <ardb at kernel.org>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > Cc: James Morse <james.morse at arm.com>
> > Cc: Joey Gouly <joey.gouly at arm.com>
> > Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
> > Cc: Will Deacon <will at kernel.org>
> > ---
> >  arch/arm64/kernel/alternative.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> > index 3fb79b76e9d9..4f32d4425aac 100644
> > --- a/arch/arm64/kernel/alternative.c
> > +++ b/arch/arm64/kernel/alternative.c
> > @@ -21,9 +21,6 @@
> >  #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
> >  #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
> >  
> > -/* Volatile, as we may be patching the guts of READ_ONCE() */
> > -static volatile int all_alternatives_applied;
> > -
> >  static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
> >  
> >  struct alt_region {
> > @@ -193,11 +190,17 @@ static void __nocfi __apply_alternatives(struct alt_region *region, bool is_modu
> >  }
> >  
> >  /*
> > - * We might be patching the stop_machine state machine, so implement a
> > - * really simple polling protocol here.
> > + * Apply alternatives, ensuring that no CPUs are concurrently executing code
> > + * being patched.
> > + *
> > + * We might be patching the stop_machine state machine or READ_ONCE(), so
> > + * we implement a simple polling protocol.
> >   */
> >  static int __apply_alternatives_multi_stop(void *unused)
> >  {
> > +	/* Volatile, as we may be patching the guts of READ_ONCE() */
> > +	static volatile int all_alternatives_applied;
> > +	static atomic_t stopped_cpus = ATOMIC_INIT(0);
> >  	struct alt_region region = {
> >  		.begin	= (struct alt_instr *)__alt_instructions,
> >  		.end	= (struct alt_instr *)__alt_instructions_end,
> > @@ -205,12 +208,16 @@ static int __apply_alternatives_multi_stop(void *unused)
> >  
> >  	/* We always have a CPU 0 at this point (__init) */
> >  	if (smp_processor_id()) {
> > +		arch_atomic_inc(&stopped_cpus);
> 
> Why can't we use normal atomic_inc() here?

In case there's any explicit instrumentation enabled in the atomic_inc()
wrapper, since the instrumentation code may call into patchable code.

Today we'd get away with using atomic_inc(), since currently all the
instrumentation happens to run before the actual AMO, but in general we're
supposed to use the arch_atomic_*() ops when we need to avoid instrumentation.

There are some other latent issues with calling into instrumentable code here,
which I plan to address in future patches, so if you want I can make this a
regular atomic_inc() for now and tackle that as a separate problem. Otherwise,
I can elaborate on the mention in the commit message to make that clearer.

> >  		while (!all_alternatives_applied)
> >  			cpu_relax();
> >  		isb();
> >  	} else {
> >  		DECLARE_BITMAP(remaining_capabilities, ARM64_NPATCHABLE);
> >  
> > +		while (arch_atomic_read(&stopped_cpus) != num_online_cpus() - 1)
> 
> and normal atomic_read() here?

Same story as above.

Thanks,
Mark.
