[PATCH 03/16] ARM: b.L: introduce helpers for platform coherency exit/setup
Santosh Shilimkar
santosh.shilimkar at ti.com
Fri Jan 11 13:34:40 EST 2013
On Friday 11 January 2013 11:37 PM, Dave Martin wrote:
> On Fri, Jan 11, 2013 at 11:16:18PM +0530, Santosh Shilimkar wrote:
>
> [...]
>
>>> +Originally created and documented by Dave Martin for Linaro Limited, in
>>> +collaboration with Nicolas Pitre and Achin Gupta.
>>> +
>> Great write-up, Dave! I might have to do a couple more passes on it to
>> get the overall idea, but this documentation is surely a good start for
>> anybody reading or reviewing the big.LITTLE switcher code.
>
> Thanks for reading through it. Partly, this was insurance against me
> forgetting how the code worked in between writing and posting it...
> but this is all quite subtle code, so it felt important to document
> it thoroughly.
>
>>
>>> +Copyright (C) 2012 Linaro Limited
>>> +Distributed under the terms of Version 2 of the GNU General Public
>>> +License, as defined in linux/COPYING.
>>> diff --git a/arch/arm/common/bL_entry.c b/arch/arm/common/bL_entry.c
>>> index 41de0622de..1ea4ec9df0 100644
>>> --- a/arch/arm/common/bL_entry.c
>>> +++ b/arch/arm/common/bL_entry.c
>>> @@ -116,3 +116,163 @@ int bL_cpu_powered_up(void)
>>> platform_ops->powered_up();
>>> return 0;
>>> }
>>> +
>>> +struct bL_sync_struct bL_sync;
>>> +
>>> +static void __sync_range(volatile void *p, size_t size)
>>> +{
>>> + char *_p = (char *)p;
>>> +
>>> + __cpuc_flush_dcache_area(_p, size);
>>> + outer_flush_range(__pa(_p), __pa(_p + size));
>>> + outer_sync();
>>> +}
>>> +
>>> +#define sync_mem(ptr) __sync_range(ptr, sizeof *(ptr))
>>> +
>>> +/*
>> /** as per kerneldoc.
>
> Does kerneldoc not require the comment to be specially formatted?
>
> I haven't played with that, so far.
>
>>
>>> + * __bL_cpu_going_down: Indicates that the cpu is being torn down.
>>> + * This must be called at the point of committing to teardown of a CPU.
>>> + * The CPU cache (SCTRL.C bit) is expected to still be active.
>>> + */
>>> +void __bL_cpu_going_down(unsigned int cpu, unsigned int cluster)
>>> +{
>>> + bL_sync.clusters[cluster].cpus[cpu].cpu = CPU_GOING_DOWN;
>>> + sync_mem(&bL_sync.clusters[cluster].cpus[cpu].cpu);
>>> +}
>>> +
>>
>> [..]
>>
>>> diff --git a/arch/arm/common/bL_head.S b/arch/arm/common/bL_head.S
>>> index 9d351f2b4c..f7a64ac127 100644
>>> --- a/arch/arm/common/bL_head.S
>>> +++ b/arch/arm/common/bL_head.S
>>> @@ -7,11 +7,19 @@
>>> * This program is free software; you can redistribute it and/or modify
>>> * it under the terms of the GNU General Public License version 2 as
>>> * published by the Free Software Foundation.
>>> + *
>>> + *
>>> + * Refer to Documentation/arm/big.LITTLE/cluster-pm-race-avoidance.txt
>>> + * for details of the synchronisation algorithms used here.
>>> */
>>>
>>> #include <linux/linkage.h>
>>> #include <asm/bL_entry.h>
>>>
>>> +.if BL_SYNC_CLUSTER_CPUS
>>> +.error "cpus must be the first member of struct bL_cluster_sync_struct"
>>> +.endif
>>> +
>>> .macro pr_dbg cpu, string
>>> #if defined(CONFIG_DEBUG_LL) && defined(DEBUG)
>>> b 1901f
>>> @@ -52,12 +60,82 @@ ENTRY(bL_entry_point)
>>> 2: pr_dbg r4, "kernel bL_entry_point\n"
>>>
>>> /*
>>> - * MMU is off so we need to get to bL_entry_vectors in a
>>> + * MMU is off so we need to get to various variables in a
>>> * position independent way.
>>> */
>>> adr r5, 3f
>>> - ldr r6, [r5]
>>> + ldmia r5, {r6, r7, r8}
>>> add r6, r5, r6 @ r6 = bL_entry_vectors
>>> + ldr r7, [r5, r7] @ r7 = bL_power_up_setup_phys
>>> + add r8, r5, r8 @ r8 = bL_sync
>>> +
>>> + mov r0, #BL_SYNC_CLUSTER_SIZE
>>> + mla r8, r0, r10, r8 @ r8 = bL_sync cluster base
>>> +
>>> + @ Signal that this CPU is coming UP:
>>> + mov r0, #CPU_COMING_UP
>>> + mov r5, #BL_SYNC_CPU_SIZE
>>> + mla r5, r9, r5, r8 @ r5 = bL_sync cpu address
>>> + strb r0, [r5]
>>> +
>>> + dsb
>> Do you really need the above dsb()? With the MMU off, the store should
>
> The short answer is "maybe not". Some of the barriers can be
> eliminated; some can be demoted to DSBs. Others may be required but
> unnecessarily duplicated, e.g. between bL_head.S and vlock.S.
>
>> make it to main memory anyway, no?
>
> Yes, but this raises issues about precisely what the architecture
> guarantees about memory ordering in these scenarios. The only obvious
> thing about that is that it's non-obvious.
>
Well, at least the ARM documentation clearly says that memory accesses
will be treated as Strongly-Ordered with the MMU off, and that means
they are expected to make it to main memory.
> Strongly-Ordered memory is not quite the same as having explicit
> barriers everywhere.
>
> I need to have a careful think, but it should be possible to optimise
> a bit here.
>
If the CCI sits in between and breaks that rule, and it needs a barrier
to flush its write buffer to main memory, then that's a different story.
Anyway, thanks for the answer.
Regards
Santosh