[PATCH 2/3] arm64, mm: Use flush_tlb_all_local() in flush_context().

David Daney ddaney.cavm at gmail.com
Sat Jul 11 13:25:22 PDT 2015

From: David Daney <david.daney at cavium.com>

When CONFIG_SMP is enabled, we end up calling flush_context() on each
CPU (indirectly) from __new_context().  Because of this, doing a
broadcast TLB invalidate is overkill, as all CPUs will be doing a local
invalidation.

Change the scope of the TLB invalidation operation to be local,
resulting in nr_cpus invalidations, rather than nr_cpus^2.
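
As a rough illustration of the counting argument above, here is a
user-space sketch (not kernel code).  It assumes every CPU runs
flush_context() exactly once per rollover and that a broadcast
invalidate is processed by every CPU.  The names
rollover_invalidations() and enum flush_scope are hypothetical and
exist only for this model.

```c
#include <assert.h>

/* Hypothetical model of the flush scope; not a kernel type. */
enum flush_scope { FLUSH_BROADCAST, FLUSH_LOCAL };

/*
 * Count the TLB invalidations processed system-wide during one
 * rollover, assuming each of nr_cpus CPUs calls flush_context() once.
 */
static unsigned long rollover_invalidations(unsigned int nr_cpus,
					    enum flush_scope scope)
{
	unsigned long total = 0;
	unsigned int cpu;

	assert(nr_cpus > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/*
		 * A broadcast flush is processed by every CPU; a local
		 * flush only by the CPU that issued it.
		 */
		total += (scope == FLUSH_BROADCAST) ? nr_cpus : 1;
	}

	return total;
}
```

Under these assumptions, a 48-CPU system processes 48 * 48 = 2304
invalidations per rollover with the broadcast flush and 48 with the
local one.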

On CPUs with a large ASID space this operation is done infrequently,
but when it is, this change reduces the overhead.

Benchmarked a "time make -j48" kernel build with and without the patch
on a Cavium ThunderX system: one run to warm up the caches, then five
measured runs:

original      with-patch
139.299s      139.0766s
S.D. 0.321    S.D. 0.159

Probably a little faster, but could be measurement noise.

Signed-off-by: David Daney <david.daney at cavium.com>
---
 arch/arm64/mm/context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 76c1e6c..ab5b8d3 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -48,7 +48,7 @@ static void flush_context(void)
 	/* set the reserved TTBR0 before flushing the TLB */
-	flush_tlb_all();
+	flush_tlb_all_local();
 	if (icache_is_aivivt())
