[RFC PATCH 6/7] arm64: KVM: Handle trappable TLB instructions

Punit Agrawal punit.agrawal at arm.com
Thu Sep 1 11:29:37 PDT 2016


Will Deacon <will.deacon at arm.com> writes:

> On Fri, Aug 26, 2016 at 10:37:08AM +0100, Punit Agrawal wrote:
>> > Will Deacon <will.deacon at arm.com> writes:
>> >> The easiest thing to do is just TLBI VMALLE1IS for all trapped operations,
>> >> but you might want to see how that performs.
>> >
>> > That sounds reasonable for correctness. But I suspect we'll have to do
>> > more to claw back some performance. Let me run a few tests and come back
>> > on this.
>> 
>> Assuming I've correctly switched in TCR and replacing the various TLB
>> operations in this patch with TLBI VMALLE1IS, there is a drop in kernel
>> build times of ~5% (384s vs 363s).
>
> What do you mean by "switched in TCR"? Why is that necessary if you just
> nuke the whole thing?

You're right, it's not necessary. I'd misunderstood how TCR affects
things and was switching it in for the above tests.

> Is the ~5% relative to no trapping at all, or
> trapping, but being selective about the operation?

The reported number was relative to trapping and being selective about
the operation. But I hadn't been careful in ensuring identical
conditions (page caches, etc.) when running the numbers.

So I've done a fresh set of measurements under identical conditions by
running "time make -j 7" in a VM booted with 7 vcpus, and see the
following results:

1. no trapping ~ 365s
2. traps using selective tlb operations ~ 371s
3. traps that nuke all stage 1 (tlbi vmalle1is) ~ 393s

So based on these measurements, build times for 2. and 3. are ~1% and
~7.5% longer, respectively, than the base case of no trapping at all.
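
For anyone wanting to redo the arithmetic, the slowdown relative to the
base case can be recomputed from the raw times with a quick script. This
is only a sketch: it uses the approximate times listed above, and the
names (times, slowdown_pct) are made up for illustration.

```python
# Approximate wall-clock build times in seconds, as reported above.
times = {
    "no trapping": 365,
    "traps using selective tlb operations": 371,
    "traps that nuke all stage 1 (tlbi vmalle1is)": 393,
}


def slowdown_pct(measured, baseline):
    """Percentage increase in build time relative to the baseline."""
    return (measured - baseline) / baseline * 100.0


baseline = times["no trapping"]
for name, secs in times.items():
    # Print each configuration with its slowdown against the base case.
    print(f"{name}: {secs}s ({slowdown_pct(secs, baseline):+.1f}% vs no trapping)")
```

With the rounded times this works out to roughly 1.6% and 7.7%, in the
same ballpark as the figures quoted above. The input times are
themselves approximate, so the percentages are too.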

Thanks,
Punit

>
> Will


