[RFC PATCH] KVM: arm64: Fix double-free following kvm_pgtable_stage2_free_unlinked()

Oliver Upton oliver.upton at linux.dev
Mon Feb 12 12:14:37 PST 2024


On Mon, Feb 12, 2024 at 07:30:52PM +0000, Will Deacon wrote:
> kvm_pgtable_stage2_free_unlinked() does the final put_page() on the
> root page of the sub-tree before returning, so remove the additional
> put_page() invocations in the callers.
> 
> Cc: Marc Zyngier <maz at kernel.org>
> Cc: Oliver Upton <oliver.upton at linux.dev>
> Cc: Ricardo Koller <ricarkol at google.com>
> Signed-off-by: Will Deacon <will at kernel.org>
> ---
> 
> Hi folks,
> 
> Sending this as an RFC as I only spotted it from code inspection and I'm
> surprised others aren't seeing fireworks if it's a genuine bug. I also
> couldn't come up with a sensible Fixes tag, as all of:
> 
>  e7c05540c694b ("KVM: arm64: Add helper for creating unlinked stage2 subtrees")
>  8f5a3eb7513fc ("KVM: arm64: Add kvm_pgtable_stage2_split()")
>  f6a27d6dc51b2 ("KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()")
> 
> are actually ok in isolation. Hrm. Please tell me I'm wrong?
> 
>  arch/arm64/kvm/hyp/pgtable.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index c651df904fe3..ab9d05fcf98b 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -1419,7 +1419,6 @@ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
>  				 level + 1);
>  	if (ret) {
>  		kvm_pgtable_stage2_free_unlinked(mm_ops, pgtable, level);
> -		mm_ops->put_page(pgtable);
>  		return ERR_PTR(ret);
>  	}

AFAICT, this entire branch is effectively dead code, unless there's a
KVM bug lurking behind the page table walk. The sub-tree isn't visible
to other software or hardware walkers yet, so none of the PTE races
could cause this to pop.

So while this is very obviously a bug, it might be pure luck that folks
haven't seen smoke here. Perhaps while fixing the bug we should take the
opportunity to promote the condition to WARN_ON_ONCE().
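
To make the double put concrete, here is a rough userspace model of the
error path. It is only a sketch: struct model_page and every model_*()
helper are made up for illustration and are not kernel APIs. The one
thing it takes from the patch description is that the free_unlinked
helper already drops the final reference on the root of the sub-tree.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted page-table page. */
struct model_page {
	int refcount;
	bool freed;
	int bad_puts;	/* puts issued after the page was already freed */
};

/* Drop one reference; free on the last one, record puts on a freed page. */
void model_put_page(struct model_page *p)
{
	if (p->freed) {
		p->bad_puts++;	/* on a real page this would be a double-free */
		return;
	}
	if (--p->refcount == 0)
		p->freed = true;
}

/*
 * Models kvm_pgtable_stage2_free_unlinked(): tearing down the child
 * tables is elided, only the final put on the root is kept.
 */
void model_free_unlinked(struct model_page *root)
{
	model_put_page(root);
}

/* Error path before the patch: an extra put follows the helper. */
int model_error_path_old(struct model_page *root)
{
	model_free_unlinked(root);
	model_put_page(root);
	return root->bad_puts;
}

/* Error path after the patch: the helper's put is the only one. */
int model_error_path_new(struct model_page *root)
{
	model_free_unlinked(root);
	return root->bad_puts;
}
```

Starting from a page with a single reference, the old path records one
put on an already-freed page and the new path records none.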

> @@ -1502,7 +1501,6 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx,
>  
>  	if (!stage2_try_break_pte(ctx, mmu)) {
>  		kvm_pgtable_stage2_free_unlinked(mm_ops, childp, level);
> -		mm_ops->put_page(childp);
>  		return -EAGAIN;
>  	}

This, on the other hand, seems possible. There is a race where the AF
gets set on the old block PTE after it was read, so the underlying
cmpxchg() could fail. There shouldn't be a race with any software walkers, as we
could fail. There shouldn't be a race with any software walkers, as we
hold the MMU lock for write here.
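
As a sketch of that failure mode, assuming the racing update comes from
hardware setting the Access Flag: the walker snapshots the old PTE, the
AF bit flips underneath it, and the compare-and-exchange against the
stale snapshot fails. The model_*() names are hypothetical, C11 atomics
stand in for the kernel's cmpxchg(), and the PTE and "locked" values
used to drive it are arbitrary placeholders.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Access Flag is bit 10 of an arm64 block/page descriptor. */
#define MODEL_PTE_AF	(UINT64_C(1) << 10)

/*
 * Models the break step: install 'locked' only if the PTE still
 * matches the value the walker read earlier. Returns false when the
 * PTE changed in the meantime.
 */
bool model_try_break_pte(_Atomic uint64_t *ptep, uint64_t snapshot,
			 uint64_t locked)
{
	return atomic_compare_exchange_strong(ptep, &snapshot, locked);
}

/* Models a hardware walker setting AF behind the software walker's back. */
void model_hw_set_af(_Atomic uint64_t *ptep)
{
	atomic_fetch_or(ptep, MODEL_PTE_AF);
}
```

If model_hw_set_af() runs between the snapshot and
model_try_break_pte(), the exchange fails and the PTE is left as the
snapshot plus AF. That is the case where the split walker frees the
unlinked sub-tree and returns -EAGAIN.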

-- 
Thanks,
Oliver


