[PATCH 01/16] ARM: b.L: secondary kernel entry code
Nicolas Pitre
nicolas.pitre at linaro.org
Thu Jan 10 20:26:21 EST 2013
On Thu, 10 Jan 2013, Will Deacon wrote:
> On Thu, Jan 10, 2013 at 12:20:36AM +0000, Nicolas Pitre wrote:
> > CPUs in a big.LITTLE systems have special needs when entering the kernel
> > due to a hotplug event, or when resuming from a deep sleep mode.
> >
> > This is vectorized so multiple CPUs can enter the kernel in parallel
> > without serialization.
> >
> > Only the basic structure is introduced here. This will be extended
> > later.
> >
> > TODO: MPIDR based indexing should eventually be made runtime adjusted.
>
> Agreed.
>
> > diff --git a/arch/arm/common/bL_entry.c b/arch/arm/common/bL_entry.c
> > new file mode 100644
> > index 0000000000..80fff49417
> > --- /dev/null
> > +++ b/arch/arm/common/bL_entry.c
> > @@ -0,0 +1,30 @@
> > +/*
> > + * arch/arm/common/bL_entry.c -- big.LITTLE kernel re-entry point
> > + *
> > + * Created by: Nicolas Pitre, March 2012
> > + * Copyright: (C) 2012 Linaro Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/init.h>
> > +
> > +#include <asm/bL_entry.h>
> > +#include <asm/barrier.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +
> > +extern volatile unsigned long bL_entry_vectors[BL_NR_CLUSTERS][BL_CPUS_PER_CLUSTER];
>
> Does this actually need to be volatile? I'd have thought a compiler
> barrier in place of the smp_wmb below would be enough (following on from
> Catalin's comments).
Actually, I did the reverse i.e. I removed the smp_wmb() entirely. A
compiler barrier forces the whole world to memory while here we only
want this particular assignment to be pushed out.
Furthermore, I like the volatile as it flags that this is a special
variable which in this case is also accessed from CPUs with no cache.
> > +void bL_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> > +{
> > + unsigned long val = ptr ? virt_to_phys(ptr) : 0;
> > + bL_entry_vectors[cluster][cpu] = val;
> > + smp_wmb();
> > + __cpuc_flush_dcache_area((void *)&bL_entry_vectors[cluster][cpu], 4);
> > + outer_clean_range(__pa(&bL_entry_vectors[cluster][cpu]),
> > + __pa(&bL_entry_vectors[cluster][cpu + 1]));
> > +}
> > diff --git a/arch/arm/common/bL_head.S b/arch/arm/common/bL_head.S
> > new file mode 100644
> > index 0000000000..9d351f2b4c
> > --- /dev/null
> > +++ b/arch/arm/common/bL_head.S
> > @@ -0,0 +1,81 @@
> > +/*
> > + * arch/arm/common/bL_head.S -- big.LITTLE kernel re-entry point
> > + *
> > + * Created by: Nicolas Pitre, March 2012
> > + * Copyright: (C) 2012 Linaro Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/bL_entry.h>
> > +
> > + .macro pr_dbg cpu, string
> > +#if defined(CONFIG_DEBUG_LL) && defined(DEBUG)
> > + b 1901f
> > +1902: .ascii "CPU 0: \0CPU 1: \0CPU 2: \0CPU 3: \0"
> > + .ascii "CPU 4: \0CPU 5: \0CPU 6: \0CPU 7: \0"
> > +1903: .asciz "\string"
> > + .align
> > +1901: adr r0, 1902b
> > + add r0, r0, \cpu, lsl #3
> > + bl printascii
> > + adr r0, 1903b
> > + bl printascii
> > +#endif
> > + .endm
> > +
> > + .arm
> > + .align
> > +
> > +ENTRY(bL_entry_point)
> > +
> > + THUMB( adr r12, BSYM(1f) )
> > + THUMB( bx r12 )
> > + THUMB( .thumb )
> > +1:
> > + mrc p15, 0, r0, c0, c0, 5
> > + ubfx r9, r0, #0, #4 @ r9 = cpu
> > + ubfx r10, r0, #8, #4 @ r10 = cluster
> > + mov r3, #BL_CPUS_PER_CLUSTER
> > + mla r4, r3, r10, r9 @ r4 = canonical CPU index
> > + cmp r4, #(BL_CPUS_PER_CLUSTER * BL_NR_CLUSTERS)
> > + blo 2f
> > +
> > + /* We didn't expect this CPU. Try to make it quiet. */
> > +1: wfi
> > + wfe
> > + b 1b
>
> I realise this CPU is stuck at this point, but you should have a dsb
> before a wfi instruction. This could be problematic with the CCI this
> early, so maybe just a comment saying that it doesn't matter because we
> don't care about this core?
Why a dsb? No data was even touched at this point. And since this is
meant to be a better "b ." kind of loop, I'd rather not try to make it
more sophisticated than it already is. And of course it is meant to
never be executed in practice.
> > +
> > +2: pr_dbg r4, "kernel bL_entry_point\n"
> > +
> > + /*
> > + * MMU is off so we need to get to bL_entry_vectors in a
> > + * position independent way.
> > + */
> > + adr r5, 3f
> > + ldr r6, [r5]
> > + add r6, r5, r6 @ r6 = bL_entry_vectors
> > +
> > +bL_entry_gated:
> > + ldr r5, [r6, r4, lsl #2] @ r5 = CPU entry vector
> > + cmp r5, #0
> > + wfeeq
> > + beq bL_entry_gated
> > + pr_dbg r4, "released\n"
> > + bx r5
> > +
> > + .align 2
> > +
> > +3: .word bL_entry_vectors - .
> > +
> > +ENDPROC(bL_entry_point)
> > +
> > + .bss
> > + .align 5
> > +
> > + .type bL_entry_vectors, #object
> > +ENTRY(bL_entry_vectors)
> > + .space 4 * BL_NR_CLUSTERS * BL_CPUS_PER_CLUSTER
>
> Is there a particular reason to put this in the bss?
Yes, to have it zero initialized without taking up binary space.
> > diff --git a/arch/arm/include/asm/bL_entry.h b/arch/arm/include/asm/bL_entry.h
> > new file mode 100644
> > index 0000000000..ff623333a1
> > --- /dev/null
> > +++ b/arch/arm/include/asm/bL_entry.h
> > @@ -0,0 +1,35 @@
> > +/*
> > + * arch/arm/include/asm/bL_entry.h
> > + *
> > + * Created by: Nicolas Pitre, April 2012
> > + * Copyright: (C) 2012 Linaro Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#ifndef BL_ENTRY_H
> > +#define BL_ENTRY_H
> > +
> > +#define BL_CPUS_PER_CLUSTER 4
> > +#define BL_NR_CLUSTERS 2
>
> Hmm, I see these have to be constant so you can allocate your space in
> the assembly file. In which case, I think it's worth changing their
> names to have MAX or LIMIT in them...
Yes, good point. I'll change them.
> maybe they could even be CONFIG options?
Nah. I prefer not adding new config options unless this is really
necessary or useful. For the forseeable future, we'll see systems with
at most 2 clusters and at most 4 CPUs per cluster. That could easily be
revisited later if that becomes unsuitable for some new systems.
Initially I wanted all those things to be runtime sized in relation with
the TODO item in the commit log. That too can come later.
Nicolas
More information about the linux-arm-kernel
mailing list