tlbi va, vaa vs. val, vaal

Mario Smarduch m.smarduch at samsung.com
Fri Feb 27 13:15:57 PST 2015


On 02/27/2015 02:24 AM, Will Deacon wrote:
> On Fri, Feb 27, 2015 at 12:12:32AM +0000, Mario Smarduch wrote:
>> I noticed that the kernel's tlbflush.h uses the tlbi va*, vaa* variants
>> instead of the val, vaal ones. Reading the manual D.5.7.2 it appears that
>> va*, vaa* versions invalidate intermediate caching of
>> translation structures.
>>
>> With stage2 enabled that may result in 20+ memory lookups
>> for a 4 level page table walk. That's assuming that intermediate
>> caching structures cache mappings from stage1 table entry to
>> host page.
> 
> Yeah, Catalin and I discussed improving the kernel support for this,
> but it requires some changes to the generic mmu_gather code so that we
> can distinguish the leaf cases. I'd also like to see that done in a way
> that takes into account different granule sizes (we currently iterate
> over huge pages in 4k chunks). Last time I touched that, I entered a
> world of pain and don't plan to return there immediately :)
> 
> Catalin -- feeling brave?
> 
> FWIW: the new IOMMU page-table stuff I just got merged *does* make use
> of leaf-invalidation for the SMMU.
> 
> Will
> 
Hi Will,
  thanks for the background. I'm guessing that how much of the page
table walk (PTWalk) is cached is implementation dependent. One old paper
quotes up to a 40% improvement on some industry benchmarks when all
stage1/2 PTWalk entries are cached.
I guess it's something to benchmark.
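
For reference, the "20+ memory lookups" figure quoted above can be
reproduced with a back-of-the-envelope count. This is only a sketch: it
assumes no TLB hit and no intermediate walk caching at all, and the
function names are made up for illustration (they are not kernel code).

```c
#include <assert.h>

/*
 * Worst-case memory accesses for one translation when nothing is cached
 * (assumption: no TLB hit and no intermediate walk caching).
 *
 * Each of the s1_levels stage-1 descriptor fetches uses an IPA that must
 * first be translated by a full stage-2 walk (s2_levels accesses) before
 * the descriptor itself can be read (1 access). The final output IPA
 * then needs one more stage-2 walk.
 */
static unsigned int nested_walk_accesses(unsigned int s1_levels,
                                         unsigned int s2_levels)
{
    assert(s1_levels > 0 && s2_levels > 0);

    return s1_levels * (s2_levels + 1) + s2_levels;
}

/* Stage 1 only: one descriptor fetch per level. */
static unsigned int single_stage_walk_accesses(unsigned int levels)
{
    assert(levels > 0);

    return levels;
}
```

With 4 levels at each stage this gives 4 * 5 + 4 = 24 accesses, against
4 with stage 1 only. How much of that a real core avoids depends on what
its walk caches hold, which is the implementation dependent part.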

- Mario

More information about the linux-arm-kernel mailing list