Unnecessary cache-line flush on page table updates?
Catalin Marinas
catalin.marinas at arm.com
Mon Jul 4 06:43:29 EDT 2011
On Mon, Jul 04, 2011 at 11:02:21AM +0100, Russell King - ARM Linux wrote:
> On Mon, Jul 04, 2011 at 10:45:32AM +0100, Catalin Marinas wrote:
> > Given these results, I think it's worth merging the patch. Can I add
> > your Tested-by?
>
> If we're going to make this change to use the coherent information,
> let's address all the places in one go, which are:
>
> clean_pmd_entry
> flush_pmd_entry
> clean_pte_table
>
> These require a little more thought as we aren't guaranteed to have
> ID_MMFR3 in place - maybe they should be callbacks into the per-CPU
> code.
For the first two, can we not clear the TLB_DCLEAN bit in
__cpu_tlb_flags, so that there is only a single check, at boot time?
For the last one, we could add a tlb_flags() check.
> It would also be a good idea to change the comment from "flush_pte"
> to "Clean data cache to PoU".
OK.
> > I think there can be a few other optimisations in the TLB area but it
> > needs some digging.
>
> The single-TLB model works fairly well, but as I thought the lack of
> mcr%? processing by GCC makes the asm/tlbflush.h code fairly disgusting
> even for a v6+v7 kernel. Luckily, we can play some tricks and sort
> some of that out. The patch below is not complete (and can result in
> some rules of the architecture being violated - namely the requirement
> for an ISB after the BTB flush without a branch between) but it
> illustrates the idea:
I'm not sure about this rule; I can ask for some clarification (we are
not changing the memory mapping of the branch we are executing).
According to the ARM ARM, the TLB invalidation sequence should be:

STR rx, [Translation table entry]          ; write the new entry to the translation table
Clean cache line [Translation table entry] ; not required with the Multiprocessing Extensions
DSB                                        ; ensure visibility of the data cleaned from the D-cache
Invalidate TLB entry by MVA (and ASID if non-global) [page address]
Invalidate BTC
DSB                                        ; ensure completion of the Invalidate TLB operation
ISB                                        ; ensure table changes are visible to instruction fetch
So we have a DSB unconditionally, and an ISB when we don't return to
user space. Starting with Cortex-A8 (well, unless you enable some
errata workarounds), the BTB invalidate is a no-op anyway, so maybe we
could issue it unconditionally as well (for ARMv6/v7). The only
problem is the inner-shareable variant when we are on SMP - maybe we
can do some run-time code patching for it.
> diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
> index d2005de..252874c 100644
> --- a/arch/arm/include/asm/tlbflush.h
> +++ b/arch/arm/include/asm/tlbflush.h
> @@ -34,15 +34,15 @@
> #define TLB_V6_D_ASID (1 << 17)
> #define TLB_V6_I_ASID (1 << 18)
>
> -#define TLB_BTB (1 << 28)
> -
> /* Unified Inner Shareable TLB operations (ARMv7 MP extensions) */
> #define TLB_V7_UIS_PAGE (1 << 19)
> #define TLB_V7_UIS_FULL (1 << 20)
> #define TLB_V7_UIS_ASID (1 << 21)
>
> /* Inner Shareable BTB operation (ARMv7 MP extensions) */
> -#define TLB_V7_IS_BTB (1 << 22)
> +#define TLB_V7_IS_BTB (1 << 26)
> +#define TLB_BTB (1 << 27)
> +#define TLB_BTB_BARRIER (1 << 28)
I won't comment on the BTB changes until we clarify the conditionality
of barriers and BTB.
--
Catalin