[PATCH 1/4] arm64: alternative: wait for other CPUs before patching

Mark Rutland mark.rutland at arm.com
Mon Dec 13 05:54:39 PST 2021


On Mon, Dec 13, 2021 at 01:41:46PM +0000, Will Deacon wrote:
> On Mon, Dec 13, 2021 at 01:31:52PM +0000, Will Deacon wrote:
> > On Fri, Dec 03, 2021 at 10:47:20AM +0000, Mark Rutland wrote:
> > > In __apply_alternatives_multi_stop() we have a "really simple polling
> > > protocol" to avoid patching code that is concurrently executed on other
> > > CPUs. Secondary CPUs wait for the boot CPU to signal that patching is
> > > complete, but the boot CPU doesn't wait for secondaries to enter the
> > > polling loop, and it's possible that patching starts while secondaries
> > > are still within the stop_machine logic.
> > > 
> > > Let's fix this by adding a vaguely simple polling protocol where the
> > > boot CPU waits for secondaries to signal that they have entered the
> > > unpatchable stop function. We can use the arch_atomic_*() functions for
> > > this, as they are not patched with alternatives.
> > > 
> > > At the same time, let's make `all_alternatives_applied` local to
> > > __apply_alternatives_multi_stop(), since it is only used there, and this
> > > makes the code a little clearer.
> > > 
> > > Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> > > Cc: Andre Przywara <andre.przywara at arm.com>
> > > Cc: Ard Biesheuvel <ardb at kernel.org>
> > > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > > Cc: James Morse <james.morse at arm.com>
> > > Cc: Joey Gouly <joey.gouly at arm.com>
> > > Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
> > > Cc: Will Deacon <will at kernel.org>
> > > ---
> > >  arch/arm64/kernel/alternative.c | 17 ++++++++++++-----
> > >  1 file changed, 12 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> > > index 3fb79b76e9d9..4f32d4425aac 100644
> > > --- a/arch/arm64/kernel/alternative.c
> > > +++ b/arch/arm64/kernel/alternative.c
> > > @@ -21,9 +21,6 @@
> > >  #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
> > >  #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
> > >  
> > > -/* Volatile, as we may be patching the guts of READ_ONCE() */
> > > -static volatile int all_alternatives_applied;
> > > -
> > >  static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
> > >  
> > >  struct alt_region {
> > > @@ -193,11 +190,17 @@ static void __nocfi __apply_alternatives(struct alt_region *region, bool is_modu
> > >  }
> > >  
> > >  /*
> > > - * We might be patching the stop_machine state machine, so implement a
> > > - * really simple polling protocol here.
> > > + * Apply alternatives, ensuring that no CPUs are concurrently executing code
> > > + * being patched.
> > > + *
> > > + * We might be patching the stop_machine state machine or READ_ONCE(), so
> > > + * we implement a simple polling protocol.
> > >   */
> > >  static int __apply_alternatives_multi_stop(void *unused)
> > >  {
> > > +	/* Volatile, as we may be patching the guts of READ_ONCE() */
> > > +	static volatile int all_alternatives_applied;
> > > +	static atomic_t stopped_cpus = ATOMIC_INIT(0);
> > >  	struct alt_region region = {
> > >  		.begin	= (struct alt_instr *)__alt_instructions,
> > >  		.end	= (struct alt_instr *)__alt_instructions_end,
> > > @@ -205,12 +208,16 @@ static int __apply_alternatives_multi_stop(void *unused)
> > >  
> > >  	/* We always have a CPU 0 at this point (__init) */
> > >  	if (smp_processor_id()) {
> > > +		arch_atomic_inc(&stopped_cpus);
> > 
> > Why can't we use normal atomic_inc() here?
> 
> Ah, ok, this is to deal with instrumentation and you add 'noinstr' when you
> factor this out later on. It does, however, mean that we need to be really
> careful with this because we're relying on (a) our atomics patching using
> static keys and (b) static key patching not requiring stop_machine().
> 
> In particular, we cannot backport this to kernels where the atomics were
> patched directly.

Another option here would be to use the __ll_sc_*() atomics directly, which at
least will break the build if backported too far?

> > >  		while (!all_alternatives_applied)
> > >  			cpu_relax();
> > >  		isb();
> > >  	} else {
> > >  		DECLARE_BITMAP(remaining_capabilities, ARM64_NPATCHABLE);
> > >  
> > > +		while (arch_atomic_read(&stopped_cpus) != num_online_cpus() - 1)
> > 
> > and normal atomic_read() here?
> 
> This one I'm still thinking doesn't need the arch_ prefix.

We could use a regular atomic_read() here, yes.

I'd used the arch_atomic_*() form for consistency with the inc().

Thanks,
Mark.
