[PATCH 03/16] ARM: b.L: introduce helpers for platform coherency exit/setup
Nicolas Pitre
nicolas.pitre at linaro.org
Thu Jan 10 12:59:54 EST 2013
On Thu, 10 Jan 2013, Catalin Marinas wrote:
> On 10 January 2013 00:20, Nicolas Pitre <nicolas.pitre at linaro.org> wrote:
> > --- /dev/null
> > +++ b/Documentation/arm/big.LITTLE/cluster-pm-race-avoidance.txt
> > @@ -0,0 +1,498 @@
> > +Big.LITTLE cluster Power-up/power-down race avoidance algorithm
> > +===============================================================
>
> Nice description and ascii art :).
>
> > --- a/arch/arm/common/bL_entry.c
> > +++ b/arch/arm/common/bL_entry.c
> > @@ -116,3 +116,163 @@ int bL_cpu_powered_up(void)
> >  		platform_ops->powered_up();
> >  	return 0;
> >  }
> > +
> > +struct bL_sync_struct bL_sync;
> > +
> > +static void __sync_range(volatile void *p, size_t size)
> > +{
> > +	char *_p = (char *)p;
> > +
> > +	__cpuc_flush_dcache_area(_p, size);
> > +	outer_flush_range(__pa(_p), __pa(_p + size));
> > +	outer_sync();
>
> The outer flush-range operations already contain a cache_sync, so an
> additional outer_sync() operation is not necessary.
>
> You (well, Dave) said that you use the flush instead of
> clean/invalidate to avoid races with other CPUs writing the location.
Yes. To clarify for everyone, the issue here is that those state values
are being written and/or read by different CPUs which may or may not
have their cache enabled. And in some cases the L1 cache is disabled
but L2 is still enabled.
So a cached reader must invalidate the cache to ensure it reads an
up-to-date value from RAM since the last update might have come from a
CPU with its cache disabled. But invalidating the cache might discard
the newly updated state from a writer with an active cache before that
writer had the chance to clean its cache to RAM. Therefore, using a
cache flush rather than a cache invalidate before every read solves
this race.
> However, on the same CPU you can get a speculative load into L1 after
> the L1 flush but before the L2 flush, so the reader case can fail.
>
> The sequence for readers is (note *L2* inval first):
>
> L2 inval
> L1 inval
As you noted below and as I explained above, this can't be an inval
operation as that could discard a concurrent writer's update.
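Concretely, a read-side variant of __sync_range() following your
ordering, but with flush operations throughout, might look like the
sketch below.  This is only an illustration, not what the posted patch
does; the _r suffix and the outer_cache.flush_range test are made up
here:

/*
 * Sketch only: read-side sync.  The outer cache is flushed before the
 * final inner flush so a speculative L1 fill between the two steps
 * cannot bring back stale L2 data.  Flush (clean+invalidate) rather
 * than invalidate is used so a concurrent cached writer's update gets
 * pushed out to RAM instead of being discarded.
 */
static void __sync_range_r(volatile void *p, size_t size)
{
	char *_p = (char *)p;

#ifdef CONFIG_OUTER_CACHE
	if (outer_cache.flush_range) {
		/* clean any dirty inner data before touching the outer cache */
		__cpuc_flush_dcache_area(_p, size);
		outer_flush_range(__pa(_p), __pa(_p + size));
	}
#endif
	/* flush L1 last so the subsequent read really comes from RAM/L2 */
	__cpuc_flush_dcache_area(_p, size);
}
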
> The sequence for writers is:
>
> L1 clean
> L2 clean
>
> The bi-directional sequence (that's what you need) is:
>
> L1 clean
> L2 clean+inval
> L1 clean+inval
>
> The last L1 op must be clean+inval in case another CPU writes to this
> location to avoid discarding the write.
>
> If you don't have an L2, you just end up with two L1 clean ops, so you
> can probably put some checks.
In fact, since this is only used on A7/A15 right now, there is no outer
cache and the outer calls are effectively no-ops. I'm wondering if
those should simply be removed until/unless there is some system showing
up with a need for them.
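If we do keep the outer calls, the write-side counterpart along the
lines of your L1-clean / L2-clean sequence could be as simple as the
sketch below.  Again, just an illustration with a made-up _w suffix;
since there is no clean-only __cpuc helper exported, the L1 step uses
the flush op, which subsumes the clean:

/*
 * Sketch only: write-side sync.  __cpuc_flush_dcache_area() is a
 * clean+invalidate, i.e. a superset of the clean needed here.  On
 * A7/A15 systems with no outer cache, outer_clean_range() is a no-op.
 */
static void __sync_range_w(volatile void *p, size_t size)
{
	char *_p = (char *)p;

	__cpuc_flush_dcache_area(_p, size);		/* push L1 data out */
	outer_clean_range(__pa(_p), __pa(_p + size));	/* then push L2 out */
}

#define sync_mem_w(ptr)	__sync_range_w(ptr, sizeof *(ptr))
#define sync_mem_r(ptr)	__sync_range_r(ptr, sizeof *(ptr))

The state writers would then use sync_mem_w() while the polling
readers use sync_mem_r() from the read-side sketch above.
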
> > +#define sync_mem(ptr) __sync_range(ptr, sizeof *(ptr))
> > +
> > +/*
> > + * __bL_cpu_going_down: Indicates that the cpu is being torn down.
> > + * This must be called at the point of committing to teardown of a CPU.
> > + * The CPU cache (SCTLR.C bit) is expected to still be active.
> > + */
> > +void __bL_cpu_going_down(unsigned int cpu, unsigned int cluster)
> > +{
> > +	bL_sync.clusters[cluster].cpus[cpu].cpu = CPU_GOING_DOWN;
> > +	sync_mem(&bL_sync.clusters[cluster].cpus[cpu].cpu);
> > +}
> > +
> > +/*
> > + * __bL_cpu_down: Indicates that cpu teardown is complete and that the
> > + * cluster can be torn down without disrupting this CPU.
> > + * To avoid deadlocks, this must be called before a CPU is powered down.
> > + * The CPU cache (SCTLR.C bit) is expected to be off.
> > + */
> > +void __bL_cpu_down(unsigned int cpu, unsigned int cluster)
> > +{
> > +	dsb();
> > +	bL_sync.clusters[cluster].cpus[cpu].cpu = CPU_DOWN;
> > +	sync_mem(&bL_sync.clusters[cluster].cpus[cpu].cpu);
> > +	sev();
>
> For the sev() here (and other functions in this patch) you need a
> dsb() before. I'm not sure outer_sync() has one.
__cpuc_flush_dcache_area() does though, via v7_flush_kern_dcache_area.
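So the dsb is already implied by the maintenance op preceding the
sev().  For reference, the waiter that pairs with that sev() would
look something like the sketch below (a hypothetical helper, only
reusing the structures quoted from the patch):

/*
 * Sketch only: spin until the given CPU has reached CPU_DOWN.  The
 * flush-before-read makes an update from a cache-disabled writer
 * visible; wfe() is woken by the writer's sev(), which is itself
 * ordered after a dsb via the cache maintenance on the writer side.
 */
static void wait_for_cpu_down(unsigned int cpu, unsigned int cluster)
{
	for (;;) {
		sync_mem(&bL_sync.clusters[cluster].cpus[cpu].cpu);
		if (bL_sync.clusters[cluster].cpus[cpu].cpu == CPU_DOWN)
			break;
		wfe();
	}
}
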
Nicolas