[PATCH] ARM: implement optimized percpu variable access
Will Deacon
will.deacon at arm.com
Mon Nov 26 06:13:37 EST 2012
Hi Rob,
On Sun, Nov 25, 2012 at 06:46:55PM +0000, Rob Herring wrote:
> On 11/22/2012 05:34 AM, Will Deacon wrote:
> > As an aside, you also need to make the asm block volatile in
> > __my_cpu_offset -- I can see it being re-ordered before the set for
> > secondary CPUs otherwise.
>
> I don't think that is right. Doing that means the register is reloaded
> on every access and you end up with code like this (from handle_IRQ):
>
> c000eb4c: ee1d2f90 mrc 15, 0, r2, cr13, cr0, {4}
> c000eb50: e7926003 ldr r6, [r2, r3]
> c000eb54: ee1d2f90 mrc 15, 0, r2, cr13, cr0, {4}
> c000eb58: e7821003 str r1, [r2, r3]
> c000eb5c: eb006cb1 bl c0029e28 <irq_enter>
>
> I don't really see where there would be a re-ordering issue. There's no
> percpu var access before or near the setting that I can see.
Well, my A15 doesn't boot with your original patch unless I make that asm
block volatile, so something does need tweaking...
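For reference, the change I made is roughly the following; this is just a
minimal sketch of the accessor (not your exact patch text), reading the
per-cpu offset back from TPIDRPRW:

	static inline unsigned long __my_cpu_offset(void)
	{
		unsigned long off;

		/*
		 * Read the per-cpu offset back from TPIDRPRW. Without
		 * "volatile" the compiler is free to CSE or reorder this
		 * asm, since it declares no dependencies; marking it
		 * volatile is what forces the mrc to be re-emitted at
		 * every use, as in your disassembly above.
		 */
		asm volatile("mrc p15, 0, %0, c13, c0, 4" : "=r" (off));
		return off;
	}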
The issue shows up when bringing up the secondary core, so I assumed that a
lot of inlining goes on inside secondary_start_kernel and the result then
gets shuffled around, placing a cpu-offset read before we've done the set.
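Concretely, the kind of thing I was imagining is sketched below (names other
than your set_my_cpu_offset() are purely illustrative; the real path is
secondary_start_kernel() in arch/arm/kernel/smp.c):

	#include <linux/percpu.h>

	static DEFINE_PER_CPU(unsigned long, example_counter);	/* illustrative */

	static void example_secondary_path(unsigned int cpu)
	{
		/* Install this CPU's offset into TPIDRPRW (the "set"). */
		set_my_cpu_offset(per_cpu_offset(cpu));

		/*
		 * Any later per-cpu access reads TPIDRPRW back through
		 * __my_cpu_offset. If that read is a plain (non-volatile)
		 * asm with no dependency on the write above, the compiler
		 * is in principle free to schedule it before
		 * set_my_cpu_offset(), leaving the secondary with a stale
		 * (zero) offset.
		 */
		this_cpu_inc(example_counter);
	}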
Unfortunately, looking at the disassembly I can't see this happening at
all, so I'll keep digging. The good news is that I've just reproduced the
problem on the model, so I've got more visibility now (although both cores
are just stuck in spinlocks...).
Will