[PATCH 1/4] arm64: alternative: wait for other CPUs before patching

Will Deacon <will at kernel.org>
Mon Dec 13 05:31:52 PST 2021


On Fri, Dec 03, 2021 at 10:47:20AM +0000, Mark Rutland wrote:
> In __apply_alternatives_multi_stop() we have a "really simple polling
> protocol" to avoid patching code that is concurrently executed on other
> CPUs. Secondary CPUs wait for the boot CPU to signal that patching is
> complete, but the boot CPU doesn't wait for secondaries to enter the
> polling loop, and it's possible that patching starts while secondaries
> are still within the stop_machine logic.
> 
> Let's fix this by adding a vaguely simple polling protocol where the
> boot CPU waits for secondaries to signal that they have entered the
> unpatchable stop function. We can use the arch_atomic_*() functions for
> this, as they are not patched with alternatives.
> 
> At the same time, let's make `all_alternatives_applied` local to
> __apply_alternatives_multi_stop(), since it is only used there, and this
> makes the code a little clearer.
> 
> Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> Cc: Andre Przywara <andre.przywara at arm.com>
> Cc: Ard Biesheuvel <ardb at kernel.org>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Cc: James Morse <james.morse at arm.com>
> Cc: Joey Gouly <joey.gouly at arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
> Cc: Will Deacon <will at kernel.org>
> ---
>  arch/arm64/kernel/alternative.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> index 3fb79b76e9d9..4f32d4425aac 100644
> --- a/arch/arm64/kernel/alternative.c
> +++ b/arch/arm64/kernel/alternative.c
> @@ -21,9 +21,6 @@
>  #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
>  #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
>  
> -/* Volatile, as we may be patching the guts of READ_ONCE() */
> -static volatile int all_alternatives_applied;
> -
>  static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
>  
>  struct alt_region {
> @@ -193,11 +190,17 @@ static void __nocfi __apply_alternatives(struct alt_region *region, bool is_modu
>  }
>  
>  /*
> - * We might be patching the stop_machine state machine, so implement a
> - * really simple polling protocol here.
> + * Apply alternatives, ensuring that no CPUs are concurrently executing code
> + * being patched.
> + *
> + * We might be patching the stop_machine state machine or READ_ONCE(), so
> + * we implement a simple polling protocol.
>   */
>  static int __apply_alternatives_multi_stop(void *unused)
>  {
> +	/* Volatile, as we may be patching the guts of READ_ONCE() */
> +	static volatile int all_alternatives_applied;
> +	static atomic_t stopped_cpus = ATOMIC_INIT(0);
>  	struct alt_region region = {
>  		.begin	= (struct alt_instr *)__alt_instructions,
>  		.end	= (struct alt_instr *)__alt_instructions_end,
> @@ -205,12 +208,16 @@ static int __apply_alternatives_multi_stop(void *unused)
>  
>  	/* We always have a CPU 0 at this point (__init) */
>  	if (smp_processor_id()) {
> +		arch_atomic_inc(&stopped_cpus);

Why can't we use normal atomic_inc() here?

>  		while (!all_alternatives_applied)
>  			cpu_relax();
>  		isb();
>  	} else {
>  		DECLARE_BITMAP(remaining_capabilities, ARM64_NPATCHABLE);
>  
> +		while (arch_atomic_read(&stopped_cpus) != num_online_cpus() - 1)

and normal atomic_read() here?

Will
