[RFC PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry

Will Deacon will at kernel.org
Mon Nov 30 08:34:21 EST 2020


On Mon, Nov 30, 2020 at 08:18:46PM +0800, Yanan Wang wrote:
> In dirty logging case(logging_active == True), we need to collapse a block
> entry into a table if necessary. After dirty logging is canceled, when merging
> tables back into a block entry, we should not only free the non-huge page
> tables but also unmap the non-huge mapping for the block. Without the unmap,
> inconsistent TLB entries for the pages in the the block will be created.
> 
> We could also use unmap_stage2_range API to unmap the non-huge mapping,
> but this could potentially free the upper level page-table page which
> will be useful later.
> 
> Signed-off-by: Yanan Wang <wangyanan55 at huawei.com>
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 696b6aa83faf..fec8dc9f2baa 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -500,6 +500,9 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
>  	return 0;
>  }
>  
> +static void stage2_flush_dcache(void *addr, u64 size);
> +static bool stage2_pte_cacheable(kvm_pte_t pte);
> +
>  static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  				struct stage2_map_data *data)
>  {
> @@ -507,9 +510,17 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  	struct page *page = virt_to_page(ptep);
>  
>  	if (data->anchor) {
> -		if (kvm_pte_valid(pte))
> +		if (kvm_pte_valid(pte)) {
> +			kvm_set_invalid_pte(ptep);
> +			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu,
> +				     addr, level);
>  			put_page(page);

This doesn't make sense to me: the page-table pages we're walking when the
anchor is set are not accessible to the hardware walker because we unhooked
the entire sub-table in stage2_map_walk_table_pre(), which has the necessary
TLB invalidation.

Are you seeing a problem in practice here?

> +			if (stage2_pte_cacheable(pte))
> +				stage2_flush_dcache(kvm_pte_follow(pte),
> +						    kvm_granule_size(level));

I don't understand the need for the flush either, as we're just coalescing
existing entries into a larger block mapping.

Will



More information about the linux-arm-kernel mailing list