[PATCH v3 00/21] KVM: arm64: Rewrite page-table code and fault handling

Gavin Shan gshan at redhat.com
Thu Sep 3 07:13:40 EDT 2020


Hi Will,

On 9/3/20 5:34 PM, Gavin Shan wrote:
> On 8/25/20 7:39 PM, Will Deacon wrote:
>> Hello folks,
>>
>> This is version three of the KVM page-table rework that I previously posted
>> here:
>>
>>    v1: https://lore.kernel.org/r/20200730153406.25136-1-will@kernel.org
>>    v2: https://lore.kernel.org/r/20200818132818.16065-1-will@kernel.org
>>
>> Changes since v2 include:
>>
>>    * Rebased onto -rc2, which includes the conflicting OOM blocking fixes
>>    * Dropped the patch trying to "fix" the memcache in kvm_phys_addr_ioremap()
>>
> 
> It's really nice work, making the code unified/simplified greatly.
> However, it seems it doesn't work well with HugeTLBfs. Please refer
> to the following test result and see if you have quick idea, or I
> can debug it a bit :)
> 
> 
> Machine         Host                     Guest              Result
> ===============================================================
> ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Passed
>               PAGE_SIZE: 64KB                    64KB     Passed
>               THP:       disabled
>               HugeTLB:   disabled
> ---------------------------------------------------------------
> ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Passed
>               PAGE_SIZE: 64KB                    64KB     Passed
>               THP:       enabled
>               HugeTLB:   disabled
> ----------------------------------------------------------------
> ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Fail[1]
>               PAGE_SIZE: 64KB                    64KB     Fail[1]
>               THP:       disabled
>               HugeTLB:   enabled
> ---------------------------------------------------------------
> ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Passed
>               PAGE_SIZE: 4KB                     64KB     Passed
>               THP:       disabled
>               HugeTLB:   disabled
> ---------------------------------------------------------------
> ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Passed
>               PAGE_SIZE: 4KB                     64KB     Passed
>               THP:       enabled
>               HugeTLB:   disabled
> ---------------------------------------------------------------
> ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Fail[2]
>               PAGE_SIZE: 4KB                     64KB     Fail[2]
>               THP:       disabled
>               HugeTLB:   enabled
> 

I debugged the code and found the issue is caused by the following
patch.

[PATCH v3 06/21] KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table

With the following change applied on top of this series, the host
kernel no longer crashes and hugetlbfs works for me. However, I don't
think it's the correct fix. I guess we still want to invalidate
the page table entry (at level 2 when the host PAGE_SIZE is 64KB) in
stage2_map_walk_table_pre(), as we're going to cut off the branch to
the subordinate tables/entries. However, stage2_map_walk_table_post()
still needs the original page table entry to release the subordinate
page properly. So I guess the proper fix would be to cache the original
page table entry in advance, or you might have a better idea :)
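
To make the caching idea concrete, here is a minimal user-space model of
it. This is only a sketch, not kernel code: struct map_data, the childp
field and the helper names are hypothetical, the TLB flush is reduced to
a comment, and a plain calloc()'d array stands in for the subordinate
table page. The pre callback records where the entry points before
clearing it, and the post callback uses the recorded pointer for the
anchor entry instead of following the (now zero) entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t pte_t;

#define PTE_VALID 1ULL

/*
 * Hypothetical walker state. 'anchor' mirrors data->anchor in the patch;
 * 'childp' is an assumed extra field holding the cached child table.
 */
struct map_data {
	pte_t *anchor;
	pte_t *childp;
};

/* Model of kvm_pte_follow(): strip the valid bit, return the child table. */
static pte_t *pte_follow(pte_t pte)
{
	return (pte_t *)(uintptr_t)(pte & ~PTE_VALID);
}

/* Pre-order callback: cache the child pointer, then invalidate the entry. */
static void table_pre(pte_t *ptep, struct map_data *data)
{
	data->childp = pte_follow(*ptep);
	*ptep = 0;	/* stands in for kvm_set_invalid_pte() plus the TLB flush */
	data->anchor = ptep;
}

/* Post-order callback: return the table page that the caller should free. */
static pte_t *table_post(pte_t *ptep, struct map_data *data)
{
	pte_t *child;

	if (!data->anchor)
		return NULL;

	if (data->anchor == ptep) {
		/* The anchor entry was zeroed in table_pre(); use the cache. */
		child = data->childp;
		data->childp = NULL;
		data->anchor = NULL;
	} else {
		/* Entries below the anchor are still intact. */
		child = pte_follow(*ptep);
	}
	return child;
}

/* Small self-check: returns 0 if the cached pointer survives invalidation. */
int cached_child_demo(void)
{
	struct map_data data = { 0 };
	pte_t *child = calloc(512, sizeof(*child));
	pte_t entry;
	pte_t *to_free;

	assert(child);
	entry = (pte_t)(uintptr_t)child | PTE_VALID;

	table_pre(&entry, &data);
	/* Following the invalidated entry now yields address 0x0. */
	assert(pte_follow(entry) == NULL);

	to_free = table_post(&entry, &data);
	assert(to_free == child);
	free(to_free);
	return 0;
}
```

In this model, following the zeroed entry in the post callback would
yield address 0x0, which matches what the pr_warn() output in this mail
shows, while the cached pointer still refers to the real table page.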

I will also reply to PATCH[06/21] to make the reply chain complete.

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 6e8ca1ec12b4..f4eacfdd73cb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -494,8 +494,8 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
         if (!kvm_block_mapping_supported(addr, end, data->phys, level))
                 return 0;
  
-       kvm_set_invalid_pte(ptep);
-       kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
+       //kvm_set_invalid_pte(ptep);
+       //kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
         data->anchor = ptep;
         return 0;
  }

For the initial debugging, I added some printk calls and got the following
output, FYI. It indicates that we're releasing the page at physical address
0x0, which is obviously incorrect.

    [  111.586180] stage2_map_walk_table_post: addr=0x40000000, end=0x60000000, level=2, anchor at 0xfffffc0f191c0010, ptep at 0xfffffc0f191c0010

    static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
         if (!data->anchor)
                 return 0;
  
+       if (*ptep == 0x0) {
+               pr_warn("%s: addr=0x%llx, end=0x%llx, level=%d, anchor at 0x%lx, ptep at 0x%lx\n",
+                        __func__, addr, end, level, (unsigned long)(data->anchor),
+                       (unsigned long)ptep);
+       }
+
         free_page((unsigned long)kvm_pte_follow(*ptep));
         put_page(virt_to_page(ptep));

By the way, I've finished the code review. I'll leave the nVHE patches to Alex to
review. I think the testing is also finished, unless you need me to do more testing.
With the issue fixed, feel free to add the following to this series:

Tested-by: Gavin Shan <gshan at redhat.com>

Thanks,
Gavin



