[PATCH V2] arm64: add DSB after icache flush in __flush_icache_all()
Russell King - ARM Linux
linux at arm.linux.org.uk
Fri Jan 31 05:48:50 EST 2014
On Fri, Jan 31, 2014 at 12:16:48AM +0000, Will Deacon wrote:
> Hi Nico,
>
> On Thu, Jan 30, 2014 at 09:42:29PM +0000, Nicolas Pitre wrote:
> > On Thu, 30 Jan 2014, Will Deacon wrote:
> > > On Thu, Jan 30, 2014 at 06:04:43AM +0000, Vinayak Kale wrote:
> > > > Can you please elaborate whether you are referring to lack of memory
> > > > clobber or missing barriers?
> > >
> > > The clobbers. For example:
> > >
> > > arch/arm64/kvm/sys_regs.c:
> > >
> > > /* Make sure no one else changes CSSELR during this! */
> > > local_irq_disable();
> > > /* Put value into CSSELR */
> > > asm volatile("msr csselr_el1, %x0" : : "r" (csselr));
> > > isb();
> > > /* Read result out of CCSIDR */
> > > asm volatile("mrs %0, ccsidr_el1" : "=r" (ccsidr));
> > > local_irq_enable();
> > >
> > > Just about everything can be re-ordered in that block, because the asm
> > > volatile statements don't have "memory" clobbers.
> >
> > I don't think they would be reordered at all with the
> > volatile qualifiers.
>
> Whilst that may be the case in current compilers (i.e. I've not actually
> seen the above sequence get re-ordered), the GCC documentation states that:
>
> Similarly, you can't expect a sequence of volatile asm instructions to remain
> perfectly consecutive. If you want consecutive output, use a single asm. Also,
> GCC performs some optimizations across a volatile asm instruction; GCC does not
> `forget everything' when it encounters a volatile asm instruction the way some
> other compilers do.
>
> so I really think that the "memory" clobbers are needed to ensure strict
> ordering. This matches my understanding from discussions with the compiler
> engineers at ARM.

What that means is that the compiler may place additional instructions
between your consecutive asm() statements. There's no guarantee that
the ISB will immediately follow the MSR instruction: the compiler may
decide to schedule other instructions between the two. For example,
instructions to load the address of the variable(s) used by the asm()
statements, possibly from a literal pool, may be inserted between them.
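
If the sequence really must stay consecutive, the documented way is to put
it in a single asm(), as the GCC text quoted above says. Below is a rough
sketch of what that could look like for the CSSELR/CCSIDR case. It is not
a proposed patch: read_ccsidr_single_asm() and the RUNNING_AT_EL1 guard are
hypothetical names made up for illustration, and single_asm_passthrough()
is only a portable stand-in so the file builds and runs on any GCC or
Clang target.

```c
#include <assert.h>

#if defined(__aarch64__) && defined(RUNNING_AT_EL1)
/*
 * Hypothetical helper (not from the kernel tree): select a cache via
 * CSSELR_EL1 and read CCSIDR_EL1 in ONE asm statement. The compiler cannot
 * schedule anything between the MSR, the ISB and the MRS, because it only
 * sees a single opaque block. The "memory" clobber additionally stops it
 * from moving memory accesses across the block.
 * RUNNING_AT_EL1 is an assumed, made-up guard macro: these system
 * registers are not accessible from userspace.
 */
static inline unsigned long read_ccsidr_single_asm(unsigned long csselr)
{
	unsigned long ccsidr;

	__asm__ __volatile__(
		"msr	csselr_el1, %x1\n\t"
		"isb\n\t"
		"mrs	%0, ccsidr_el1"
		: "=r" (ccsidr)
		: "r" (csselr)
		: "memory");
	return ccsidr;
}
#endif

/*
 * Portable stand-in with the same shape: one asm statement, one operand
 * that is both input and output, plus a "memory" clobber. The template is
 * empty, so no instructions are emitted and the value is unchanged, but
 * the compiler has to treat the statement as if it might modify v.
 */
static inline unsigned long single_asm_passthrough(unsigned long v)
{
	__asm__ __volatile__("" : "+r" (v) : : "memory");
	return v;
}

/* Usage: the value passes through the asm statement unchanged. */
static void self_check(void)
{
	assert(single_asm_passthrough(42UL) == 42UL);
}
```

With separate asm() statements, the operand setup for each one (such as an
address load from a literal pool) may land between them; with a single
asm(), all operands are set up before the block is entered.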