[PATCH v2] ARM: implement optimized percpu variable access

Will Deacon will.deacon at arm.com
Thu Nov 29 10:05:09 EST 2012


Hi Rob,

Thanks for the v2. One comment inline...

On Thu, Nov 29, 2012 at 02:52:44PM +0000, Rob Herring wrote:
> From: Rob Herring <rob.herring at calxeda.com>
> 
> Use the previously unused TPIDRPRW register to store percpu offsets.
> TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.
> 
> This replaces 2 loads with a mrc instruction for each percpu variable
> access. With hackbench, the performance improvement is 1.4% on Cortex-A9
> (highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:
> 
> Before: 6.2191
> After: 6.1348
> 
> Will Deacon reported similar delta on v6 with 11MPCore.
> 
> The asm "memory" constraints are needed here to ensure the percpu offset
> gets reloaded after a preemption point. Will's testing found that, without
> them, the reload would not happen in __schedule(), which is a bit of a
> special case: preemption is disabled, yet execution can still migrate
> between cores.
> 
> Signed-off-by: Rob Herring <rob.herring at calxeda.com>
> Acked-by: Will Deacon <will.deacon at arm.com>
> ---
> Changes in v2:
> - Add asm "memory" constraint
> - Only enable on v6K and v7 and avoid enabling for v6 SMP_ON_UP
> - Fix missing initialization of TPIDRPRW for resume path
> - Move cpu_init to beginning of secondary_start_kernel to ensure percpu
>   variables can be accessed as early as possible.

[...]

> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index fbc8b26..aadcca7 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -296,6 +296,8 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
>  	struct mm_struct *mm = &init_mm;
>  	unsigned int cpu;
>  
> +	cpu_init();
> +
>  	/*
>  	 * The identity mapping is uncached (strongly ordered), so
>  	 * switch away from it before attempting any exclusive accesses.
> @@ -315,7 +317,6 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
>  
>  	printk("CPU%u: Booted secondary processor\n", cpu);
>  
> -	cpu_init();
>  	preempt_disable();
>  	trace_hardirqs_off();

It's really not safe to move cpu_init() that early: at that point we're
still running on the strongly-ordered identity mapping, so the exclusive
accesses used by the locks don't work.

Will


