[PATCH v2 1/3] KVM: arm64: Reset page order in pKVM hyp_pool_init
Vincent Donnefort
vdonnefort at google.com
Thu May 21 06:21:02 PDT 2026
On Thu, May 21, 2026 at 02:07:36PM +0100, Fuad Tabba wrote:
> On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort at google.com> wrote:
> >
> > When a VM fails to initialise after its stage-2 hyp_pool has been
> > initialised, that stage-2 must be torn down entirely. This requires
> > resetting both the refcount and the order of its pages back to 0.
> >
> > Currently, reclaim_pgtable_pages() implicitly resets the page order by
> > allocating the entire pool with order-0 granularity. However, in the VM
> > initialisation error path, the addresses of the donated memory (the PGD)
> > are already known, making it unnecessary to iterate over all pages in
> > the pool.
> >
> > Since the vmemmap page order is a hyp_pool-specific field, leaving a
> > non-zero order on hyp_pool destruction is harmless until another pool
> > attempts to admit the page. Instead of resetting this field during
> > destruction, reset it during pool initialization in hyp_pool_init().
> > Note that pages added to the pool outside of the initial pool range
> > (e.g., via guest_s2_zalloc_page()) must still have their order managed
> > manually.
> >
> > While at it, add a WARN_ON() in the hyp_pool attach path to catch
> > unexpected page orders that exceed the pool's max_order.
> >
> > Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> > Reported-by: Sashiko <sashiko-bot at kernel.org>
> > Signed-off-by: Vincent Donnefort <vdonnefort at google.com>
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 25f04629014e..89eb20d4fee4 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -322,7 +322,6 @@ void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> > while (addr) {
> > page = hyp_virt_to_page(addr);
> > page->refcount = 0;
> > - page->order = 0;
> > push_hyp_memcache(mc, addr, hyp_virt_to_phys);
> > WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
> > addr = hyp_alloc_pages(&vm->pool, 0);
> > diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > index a1eb27a1a747..c3b3dc5a8ea7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > @@ -97,6 +97,8 @@ static void __hyp_attach_page(struct hyp_pool *pool,
> > u8 order = p->order;
> > struct hyp_page *buddy;
> >
> > + WARN_ON(p->order > pool->max_order);
> > +
>
> Could you add a brief comment? It took me a minute to figure out what this
> catches. IIUC it's not attach's own input, it's a stale p->order from way back
> when an external page was popped from a memcache (today only via
> guest_s2_zalloc_page()). Right?
I think it'd be self-explanatory if that was next to the page_add_to_list, but that
wouldn't protect the memset (that's really best-effort though).
How about?
/*
* A page with an order bigger than the pool's max is an 'external' page
* whose order hasn't been reset before being added to the pool.
*/
But now I am thinking I can do way better: we can easily identify external
pages, so I could just force the order to 0 in that case.
WDYS?
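
For illustration only, here is a minimal userspace sketch of that idea, not the actual change. It assumes an 'external' page is one whose physical address falls outside the pool's initial range. The toy_page, toy_pool and toy_attach_order names are hypothetical stand-ins, not the hyp_pool API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-page vmemmap metadata. */
struct toy_page {
	uint16_t refcount;
	uint8_t order;
};

/* Hypothetical stand-in for the pool; range_end is exclusive. */
struct toy_pool {
	uint64_t range_start;
	uint64_t range_end;
	uint8_t max_order;
};

/* Assumption: a page is 'external' if it lies outside the pool's range. */
static bool toy_page_is_external(const struct toy_pool *pool, uint64_t phys)
{
	return phys < pool->range_start || phys >= pool->range_end;
}

/*
 * Return the order to use when attaching the page. An external page may
 * carry a stale order from a previous owner, so it is forced to 0 rather
 * than trusted. In-range pages keep the order they already have.
 */
static uint8_t toy_attach_order(const struct toy_pool *pool,
				struct toy_page *p, uint64_t phys)
{
	if (toy_page_is_external(pool, phys))
		p->order = 0;

	return p->order;
}
```

With something along those lines, a stale order on an external page would be corrected before anything sized by the order (such as the memset) runs, instead of only being flagged by a WARN_ON().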
>
> With that.
>
> Reviewed-by: Fuad Tabba <tabba at google.com>
> Tested-by: Fuad Tabba <tabba at google.com>
>
> Cheers,
> /fuad
>
>
>
>
> > memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order);
> >
> > /* Skip coalescing for 'external' pages being freed into the pool. */
> > @@ -237,8 +239,10 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
> >
> > /* Init the vmemmap portion */
> > p = hyp_phys_to_page(phys);
> > - for (i = 0; i < nr_pages; i++)
> > + for (i = 0; i < nr_pages; i++) {
> > hyp_set_page_refcounted(&p[i]);
> > + p[i].order = 0;
> > + }
> >
> > /* Attach the unused pages to the buddy tree */
> > for (i = reserved_pages; i < nr_pages; i++)
> > --
> > 2.54.0.746.g67dd491aae-goog
> >