[RFC PATCH 1/3] KVM: arm64: Fix possible memory leak in kvm stage2

Will Deacon will at kernel.org
Tue Dec 1 13:15:11 EST 2020


On Wed, Dec 02, 2020 at 01:19:35AM +0800, wangyanan (Y) wrote:
> On 2020/12/1 22:16, Will Deacon wrote:
> > On Tue, Dec 01, 2020 at 03:21:23PM +0800, wangyanan (Y) wrote:
> > > On 2020/11/30 21:21, Will Deacon wrote:
> > > > On Mon, Nov 30, 2020 at 08:18:45PM +0800, Yanan Wang wrote:
> > > > > @@ -476,6 +477,7 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
> > > > >    	/* There's an existing valid leaf entry, so perform break-before-make */
> > > > >    	kvm_set_invalid_pte(ptep);
> > > > >    	kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
> > > > > +	put_page(virt_to_page(ptep));
> > > > >    	kvm_set_valid_leaf_pte(ptep, phys, data->attr, level);
> > > > >    out:
> > > > >    	data->phys += granule;
> > > > Isn't this hunk alone sufficient to solve the problem?
> > > > 
> > > Not quite sufficient. When the old ptep is valid and the old pte equals
> > > the new pte, kvm_set_valid_leaf_pte() still returns "true" in that case,
> > > and get_page() will still be called.
> > I had a go at fixing this without introducing refcounting to the hyp stage-1
> > case, and ended up with the diff below. What do you think?
> 
> Functionally this diff looks fine to me. A small comment inline, please see
> below.
> 
> I had made an alternative fix (after sending v1) and it looks much more
> concise.
> 
> If you're ok with it, I can send it as v2 (together with patch#2 and #3)
> after some tests.

Thanks.

> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 0271b4a3b9fe..b232bdd142a6 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -470,6 +470,9 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
>         if (!kvm_block_mapping_supported(addr, end, phys, level))
>                 return false;
> 
> +       if (kvm_pte_valid(*ptep))
> +               put_page(virt_to_page(ptep));
> +
>         if (kvm_set_valid_leaf_pte(ptep, phys, data->attr, level))
>                 goto out;

This is certainly smaller, but see below.

> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index 0271b4a3b9fe..78e2c0dc47ae 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -170,23 +170,16 @@ static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp)
> >   	smp_store_release(ptep, pte);
> >   }
> > -static bool kvm_set_valid_leaf_pte(kvm_pte_t *ptep, u64 pa, kvm_pte_t attr,
> > -				   u32 level)
> > +static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
> >   {
> > -	kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(pa);
> > +	kvm_pte_t pte = kvm_phys_to_pte(pa);
> >   	u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE :
> >   							   KVM_PTE_TYPE_BLOCK;
> >   	pte |= attr & (KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI);
> >   	pte |= FIELD_PREP(KVM_PTE_TYPE, type);
> >   	pte |= KVM_PTE_VALID;
> > -
> > -	/* Tolerate KVM recreating the exact same mapping. */
> > -	if (kvm_pte_valid(old))
> > -		return old == pte;
> > -
> > -	smp_store_release(ptep, pte);
> > -	return true;
> > +	return pte;
> >   }
> >   static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr,
> > @@ -341,12 +334,17 @@ static int hyp_map_set_prot_attr(enum kvm_pgtable_prot prot,
> >   static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level,
> >   				    kvm_pte_t *ptep, struct hyp_map_data *data)
> >   {
> > +	kvm_pte_t new, old = *ptep;
> >   	u64 granule = kvm_granule_size(level), phys = data->phys;
> >   	if (!kvm_block_mapping_supported(addr, end, phys, level))
> >   		return false;
> > -	WARN_ON(!kvm_set_valid_leaf_pte(ptep, phys, data->attr, level));
> > +	/* Tolerate KVM recreating the exact same mapping. */
> > +	new = kvm_init_valid_leaf_pte(phys, data->attr, level);
> > +	if (old != new && !WARN_ON(kvm_pte_valid(old)))
> > +		smp_store_release(ptep, new);
> > +
> >   	data->phys += granule;
> >   	return true;
> >   }
> > @@ -465,19 +463,24 @@ static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
> >   				       kvm_pte_t *ptep,
> >   				       struct stage2_map_data *data)
> >   {
> > +	kvm_pte_t new, old = *ptep;
> >   	u64 granule = kvm_granule_size(level), phys = data->phys;
> >   	if (!kvm_block_mapping_supported(addr, end, phys, level))
> >   		return false;
> > -	if (kvm_set_valid_leaf_pte(ptep, phys, data->attr, level))
> > -		goto out;
> > +	new = kvm_init_valid_leaf_pte(phys, data->attr, level);
> > +	if (kvm_pte_valid(old)) {
> > +		/*
> > +		 * There's an existing valid leaf entry, so perform
> > +		 * break-before-make.
> > +		 */
> > +		kvm_set_invalid_pte(ptep);
> > +		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
> > +		put_page(virt_to_page(ptep));
> 
> When the old PTE is valid and equals the new one, we will also perform
> break-before-make and install the new PTE, but neither is actually
> necessary in this case.

Agreed, but I don't think that case should happen for stage-2 mappings.
That's why I prefer my diff here, as it makes that 'old == new' logic
specific to the hyp mappings.

But I'll leave it all up to you (these diffs are only intended to be
helpful).

Will
