[RFC PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry

wangyanan (Y) wangyanan55 at huawei.com
Tue Dec 1 12:20:33 EST 2020


On 2020/12/1 22:35, Marc Zyngier wrote:

> Hi Yanan,
>
> On 2020-12-01 14:11, wangyanan (Y) wrote:
>> On 2020/12/1 21:46, Will Deacon wrote:
>>> On Tue, Dec 01, 2020 at 10:30:41AM +0800, wangyanan (Y) wrote:
>
> [...]
>
>>>> The point is at b.iii, where the TLBI is not enough. There are many
>>>> page mappings that we need to merge into a block mapping.
>>>>
>>>> We invalidate the TLB for the input address without a level hint at
>>>> b.iii, but this operation only flushes the TLB for one page mapping;
>>>> there are still TLB entries for the other page mappings in the
>>>> cache, and the MMU hardware walker can still hit them next time.
>>> Ah, yes, I see. Thanks. I hadn't considered the case where there are
>>> table entries beneath the anchor. So how about the diff below?
>>>
>>> Will
>>>
>>> --->8
>>
>> Hi, I think it's inappropriate to put the TLBI of all the leaf entries
>> in function stage2_map_walk_table_post(), because the *ptep must be an
>> upper table entry when we enter stage2_map_walk_table_post().
>>
>> We should issue the TLBI for every leaf entry (not the table entry) at
>> the last lookup level, which is why I am proposing to add the
>> additional TLBI in function stage2_map_walk_leaf().
>
> Could you make your concerns explicit? As far as I can tell, this should
> address the bug you found, at least from a correctness perspective.
>
> Are you worried about the impact of the full S2 invalidation? Or do you
> see another correctness issue?


Hi Will, Marc,


After rechecking the diff, I agree that the full S2 invalidation in
stage2_map_walk_table_post() is enough to solve this problem.
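If I read your suggestion correctly, the post-order variant would look
roughly like the sketch below (my reconstruction against the current
pgtable.c, not your exact diff):

static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
				      kvm_pte_t *ptep,
				      struct stage2_map_data *data)
{
	int ret = 0;
	kvm_pte_t pte = *ptep;

	if (!data->anchor)
		return 0;

	/*
	 * Sketch: invalidate the table entry and flush the whole
	 * stage-2 TLB for this VMID. Note this runs for every table
	 * entry beneath the anchor as the walker unwinds.
	 */
	kvm_set_invalid_pte(ptep);
	kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu);

	free_page((unsigned long)kvm_pte_follow(pte));
	put_page(virt_to_page(ptep));

	if (data->anchor == ptep) {
		data->anchor = NULL;
		ret = stage2_map_walk_leaf(addr, end, level, ptep, data);
	}

	return ret;
}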

But I was wondering if we could instead do the full S2 invalidation in
stage2_map_walk_table_pre(), where __kvm_tlb_flush_vmid() would be
called only once.

If we add the full TLBI in stage2_map_walk_table_post(),
__kvm_tlb_flush_vmid() might be called many times in the loop, and lots
of unnecessary CPU cycles would be wasted.
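To put a number on it (assuming 4K pages and fully populated tables):
when merging a 1G block at level 1, the anchor's level-1 entry has 512
level-2 table entries beneath it, so a flush in the post-order callback
would issue __kvm_tlb_flush_vmid() 513 times (512 level-2 table entries
plus the anchor itself), whereas a flush in
stage2_map_walk_table_pre() is issued exactly once.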

What I'm suggesting is something like the diff below; please let me
know what you think.

If this is OK, I can update the diff in v2 and send it with your SoB
(is that appropriate?) after some tests.


diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index b232bdd142a6..f11fb2996080 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -496,7 +496,7 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
 		return 0;
 
 	kvm_set_invalid_pte(ptep);
-	kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
+	kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu);
 	data->anchor = ptep;
 	return 0;
 }
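For reference, on the nVHE path __kvm_tlb_flush_vmid() boils down to a
single vmalls12e1is under the guest's VMID, roughly (from
arch/arm64/kvm/hyp/nvhe/tlb.c, trimmed):

void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
{
	struct tlb_inv_context cxt;

	dsb(ishst);

	/* Switch to the guest's VMID before invalidating */
	mmu = kern_hyp_va(mmu);
	__tlb_switch_to_guest(mmu, &cxt);

	/* Invalidate all stage-1/stage-2 entries for this VMID */
	__tlbi(vmalls12e1is);
	dsb(ish);
	isb();

	__tlb_switch_to_host(&cxt);
}

So doing it once in table_pre keeps us to a single full invalidation,
instead of repeating it for every table entry we tear down.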


Thanks,

Yanan



