[PATCH] KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
Marc Zyngier
maz at kernel.org
Mon May 22 03:48:38 PDT 2023
Hi Fuad,
On Mon, 22 May 2023 11:32:58 +0100,
Fuad Tabba <tabba at google.com> wrote:
>
> The preorder callback on the kvm_pgtable_stage2_map() path can replace
> a table with a block, then recursively free the detached table. The
> higher-level walking logic stashes the old page table entry and
> then walks the freed table, invoking the leaf callback and
> potentially freeing pgtable pages prematurely.
>
> In normal operation, the call to tear down the detached stage-2
> is indirected and uses an RCU callback to trigger the freeing.
> RCU is not available to pKVM, which is where this bug is
> triggered.
>
> Change the behavior of the walker to reload the page table entry
> after invoking the walker callback on preorder traversal, as it
> does for leaf entries.
Thanks for the fix and the detailed explanation. A couple of nits,
none of which deserve a respin on their own (I can fix up things when
applying the patch).
>
> Tested on Pixel 6.
>
> Fixes: 5c359cca1faf ("KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make")
>
Spurious empty line. In general, please keep the trailers grouped
together, as it otherwise tends to confuse git-interpret-trailers.
> Suggested-by: Oliver Upton <oliver.upton at linux.dev>
> Signed-off-by: Fuad Tabba <tabba at google.com>
>
> ---
>
> Based on: f1fcbaa18b28 (6.4-rc2)
>
> The bug can be triggered by applying Will's FFA series [1] to
> android mainline [2] and booting a Pixel 6 in protected mode
> (pKVM).
>
> [1] 20230419122051.1341-1-will at kernel.org
> [2] https://android.googlesource.com/kernel/common/+/refs/tags/android-mainline-6.3
> ---
> arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
> arch/arm64/kvm/hyp/pgtable.c | 14 +++++++++++++-
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 4cd6762bda80..3664f1d85ce6 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -631,9 +631,9 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size);
> *
> * The walker will walk the page-table entries corresponding to the input
> * address range specified, visiting entries according to the walker flags.
> - * Invalid entries are treated as leaf entries. Leaf entries are reloaded
> - * after invoking the walker callback, allowing the walker to descend into
> - * a newly installed table.
> + * Invalid entries are treated as leaf entries. The visited page table entry is
> + * reloaded after invoking the walker callback, allowing the walker to descend
> + * into a newly installed table.
> *
> * Returning a negative error code from the walker callback function will
> * terminate the walk immediately with the same error code.
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 3d61bd3e591d..120c49d52ca0 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -207,14 +207,26 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> .flags = flags,
> };
> int ret = 0;
> + bool reload = false;
> kvm_pteref_t childp;
> bool table = kvm_pte_table(ctx.old, level);
>
> - if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE))
> + if (table && (ctx.flags & KVM_PGTABLE_WALK_TABLE_PRE)) {
> ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_TABLE_PRE);
> + reload = true;
> + }
>
> if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) {
> ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF);
> + reload = true;
> + }
From these two clauses, it is clear that reload is always the value of
(ctx.flags & KVM_PGTABLE_WALK_LEAF). That'd simplify the patch a bit.
> +
> + /*
> + * Reload the page table after invoking the walker callback for leaf
> + * entries or after pre-order traversal, to allow the walker to descend
> + * into a newly installed or replaced table.
> + */
> + if (reload) {
> ctx.old = READ_ONCE(*ptep);
> table = kvm_pte_table(ctx.old, level);
> }
>
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list