[PATCH] KVM: arm64: Reload PTE after invoking walker callback on preorder traversal

Marc Zyngier maz at kernel.org
Mon May 22 03:48:38 PDT 2023


Hi Fuad,

On Mon, 22 May 2023 11:32:58 +0100,
Fuad Tabba <tabba at google.com> wrote:
> 
> The preorder callback on the kvm_pgtable_stage2_map() path can replace
> a table with a block, then recursively free the detached table. The
> higher-level walking logic stashes the old page table entry and
> then walks the freed table, invoking the leaf callback and
> potentially freeing pgtable pages prematurely.
> 
> In normal operation, the call to tear down the detached stage-2
> is indirected and uses an RCU callback to trigger the freeing.
> RCU is not available to pKVM, which is where this bug is
> triggered.
> 
> Change the behavior of the walker to reload the page table entry
> after invoking the walker callback on preorder traversal, as it
> does for leaf entries.

Thanks for the fix and the detailed explanation. A couple of nits,
none of which deserve a respin on their own (I can fix up things when
applying the patch).
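
For reference, the failure mode described above can be modelled in
userspace. This is only a toy sketch, not the kernel code: pte_t, the
PTE_* and WALK_* values, struct visit_ctx, visit() and
replace_table_with_block() are all made-up names, and error handling is
left out. It shows that a table/leaf decision taken from a snapshot of
the entry goes stale once the pre-order callback rewrites that entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the walker flags and PTE bits. */
#define WALK_LEAF      (1u << 0)
#define WALK_TABLE_PRE (1u << 1)
#define PTE_VALID      (1ull << 0)
#define PTE_TABLE      (1ull << 1)

typedef uint64_t pte_t;

struct visit_ctx {
	pte_t *ptep;	/* entry being visited */
	pte_t old;	/* snapshot taken before any callback runs */
	unsigned int flags;
};

typedef int (*visit_cb)(struct visit_ctx *ctx, unsigned int reason);

static inline bool pte_is_table(pte_t pte)
{
	return (pte & (PTE_VALID | PTE_TABLE)) == (PTE_VALID | PTE_TABLE);
}

/*
 * Visit one entry. Returns true if the walker would go on to descend
 * into a child table. With reload_after_pre == false this mimics the
 * behaviour before the fix; with true, the behaviour after it.
 */
bool visit(pte_t *ptep, unsigned int flags, visit_cb cb, bool reload_after_pre)
{
	struct visit_ctx ctx = { .ptep = ptep, .old = *ptep, .flags = flags };
	bool table = pte_is_table(ctx.old);
	bool reload = false;

	assert(ptep && cb);

	if (table && (flags & WALK_TABLE_PRE)) {
		(void)cb(&ctx, WALK_TABLE_PRE);
		reload = reload_after_pre;
	}

	if (!table && (flags & WALK_LEAF)) {
		(void)cb(&ctx, WALK_LEAF);
		reload = true;
	}

	if (reload) {
		/* Volatile read as a rough stand-in for READ_ONCE(). */
		ctx.old = *(volatile pte_t *)ptep;
		table = pte_is_table(ctx.old);
	}

	return table;
}

/* Pre-order callback that replaces the table entry with a block. */
int replace_table_with_block(struct visit_ctx *ctx, unsigned int reason)
{
	if (reason == WALK_TABLE_PRE)
		*ctx->ptep = PTE_VALID;	/* valid, non-table entry */
	return 0;
}
```

Calling visit() on a table entry with both flags set and
replace_table_with_block() as the callback returns true without the
reload (the walker would descend through the stale snapshot) and false
with it.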

> 
> Tested on Pixel 6.
> 
> Fixes: 5c359cca1faf ("KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make")
> 

Spurious empty line. In general, please keep the trailers grouped
together, as it otherwise tends to confuse git-interpret-trailers.

> Suggested-by: Oliver Upton <oliver.upton at linux.dev>
> Signed-off-by: Fuad Tabba <tabba at google.com>
> 
> ---
> 
> Based on: f1fcbaa18b28 (6.4-rc2)
> 
> The bug can be triggered by applying Will's FFA series [1] to
> android mainline [2] and booting a Pixel 6 in protected mode
> (pKVM).
> 
> [1] 20230419122051.1341-1-will at kernel.org
> [2] https://android.googlesource.com/kernel/common/+/refs/tags/android-mainline-6.3
> ---
>  arch/arm64/include/asm/kvm_pgtable.h |  6 +++---
>  arch/arm64/kvm/hyp/pgtable.c         | 14 +++++++++++++-
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 4cd6762bda80..3664f1d85ce6 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -631,9 +631,9 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size);
>   *
>   * The walker will walk the page-table entries corresponding to the input
>   * address range specified, visiting entries according to the walker flags.
> - * Invalid entries are treated as leaf entries. Leaf entries are reloaded
> - * after invoking the walker callback, allowing the walker to descend into
> - * a newly installed table.
> + * Invalid entries are treated as leaf entries. The visited page table entry is
> + * reloaded after invoking the walker callback, allowing the walker to descend
> + * into a newly installed table.
>   *
>   * Returning a negative error code from the walker callback function will
>   * terminate the walk immediately with the same error code.
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 3d61bd3e591d..120c49d52ca0 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -207,14 +207,26 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
>  		.flags	= flags,
>  	};
>  	int ret = 0;
> +	bool reload = false;
>  	kvm_pteref_t childp;
>  	bool table = kvm_pte_table(ctx.old, level);
>  
> -	if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE))
> +	if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE)) {
>  		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_PRE);
> +		reload = true;
> +	}
>  
>  	if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) {
>  		ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF);
> +		reload = true;
> +	}

From these two clauses, it is clear that reload can only be set when
(ctx.flags & (KVM_PGTABLE_WALK_TABLE_PRE | KVM_PGTABLE_WALK_LEAF)) is
non-zero. Testing that directly would simplify the patch a bit.
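
How the two clauses relate to the walk flags can be checked
exhaustively. A small sketch, with made-up names (WALK_LEAF,
WALK_TABLE_PRE, reload_per_clause(), reload_from_flags(),
count_mismatches()) standing in for the kernel ones, comparing the
per-clause reload from the patch against a test on both flags:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two walker flags involved. */
#define WALK_LEAF      (1u << 0)
#define WALK_TABLE_PRE (1u << 1)

/* reload as computed by the two clauses in the patch. */
bool reload_per_clause(bool table, unsigned int flags)
{
	bool reload = false;

	if (table && (flags & WALK_TABLE_PRE))
		reload = true;

	if (!table && (flags & WALK_LEAF))
		reload = true;

	return reload;
}

/* Alternative that looks at the flags only. */
bool reload_from_flags(unsigned int flags)
{
	return (flags & (WALK_LEAF | WALK_TABLE_PRE)) != 0;
}

/* Count the (table, flags) combinations where the two disagree. */
int count_mismatches(void)
{
	int n = 0;

	for (int table = 0; table <= 1; table++)
		for (unsigned int flags = 0; flags < 4; flags++)
			if (reload_per_clause(table, flags) !=
			    reload_from_flags(flags))
				n++;

	return n;
}
```

In this model the two only disagree when a single flag is set and the
entry is of the other kind. No callback has run in that case, so the
flag-only variant should cost nothing more than a redundant read of the
entry.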

> +
> +	/*
> +	 * Reload the page table after invoking the walker callback for leaf
> +	 * entries or after pre-order traversal, to allow the walker to descend
> +	 * into a newly installed or replaced table.
> +	 */
> +	if (reload) {
>  		ctx.old = READ_ONCE(*ptep);
>  		table = kvm_pte_table(ctx.old, level);
>  	}
> 

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


