[PATCH 17/30] KVM: arm64: Generalise kvm_pgtable_stage2_set_owner()

Will Deacon will at kernel.org
Fri Jan 16 16:03:59 PST 2026


On Fri, Jan 09, 2026 at 06:46:04PM +0000, Will Deacon wrote:
> On Tue, Jan 06, 2026 at 03:20:15PM +0000, Quentin Perret wrote:
> > On Monday 05 Jan 2026 at 15:49:25 (+0000), Will Deacon wrote:
> > >  /**
> > > - * kvm_pgtable_stage2_set_owner() - Unmap and annotate pages in the IPA space to
> > > - *				    track ownership.
> > > + * kvm_pgtable_stage2_annotate() - Unmap and annotate pages in the IPA space
> > > + *				   to track ownership (and more).
> > >   * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
> > >   * @addr:	Base intermediate physical address to annotate.
> > >   * @size:	Size of the annotated range.
> > >   * @mc:		Cache of pre-allocated and zeroed memory from which to allocate
> > >   *		page-table pages.
> > > - * @owner_id:	Unique identifier for the owner of the page.
> > > + * @annotation:	A 62-bit value that will be stored in the page tables.
> > > + *		@annotation[0] and @annotation[63] must be 0.
> > > + * 		@annotation[62:1] is stored in the page tables.
> > >   *
> > >   * By default, all page-tables are owned by identifier 0. This function can be
> > >   * used to mark portions of the IPA space as owned by other entities. When a
> > > @@ -673,8 +678,8 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > >   *
> > >   * Return: 0 on success, negative error code on failure.
> > >   */
> > > -int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > > -				 void *mc, u8 owner_id);
> > > +int kvm_pgtable_stage2_annotate(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > > +				void *mc, kvm_pte_t annotation);
> > 
> > While we're on this topic, perhaps we could go one step further and 'type'
> > the annotation itself? For instance have a 'type' and 'meta' parameter
> > directly at the kvm_pgtable_stage2_annotate() level instead of leaving
> > that up to the callers. This would allow us to have one place to allocate
> > annotation 'types' (donated pages, locked PTE, MMIO guard, ...) and one
> > way to serialize/deserialize them. That 'type' would be stored in top 2
> > or 3 bits of the PTE for instance, and decoding of the 'meta' field would
> > be dependent on the type value. Thoughts?
> 
> I don't think a global 'type' space is particularly beneficial, as most
> annotations (with the exception of PTE_LOCKED) are specific to the owner
> and putting them into a single number space will just waste bits.
> 
> But I do like the idea of encoding an annotation type in the pte and
> defining those per-owner. I think it would also make some of the code
> more robust; for example, I noticed that __pkvm_guest_unshare_host()
> isn't putting back the right annotation with my series when I started
> looking at implementing your idea.
> 
> I'll come back with a diff. It won't be quite what you're suggesting,
> but let's see what you think.

Ok, so this took a fair bit longer than I initially thought it would.

I've ended up with something closer to what you seem to have in mind:
kvm_pgtable_stage2_annotate() now takes both a 'type' and an annotation,
but it's not without complexity:

  * We're extremely short on bits when encoding the guest gfn + handle.
    The gfn can be 40 bits with LPA2 and 4k pages, so with the 16-bit
    handle and a 4-bit type, that leaves only 3 bits for the owner.

  * Non-zero invalid ptes are 'counted', so I've had to adjust the
    reclaim path to map pages back into the host immediately rather than
    go via a typed annotation.
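
To make the bit budget in the first bullet concrete, here is a rough
sketch of how the fields might be packed. It assumes that only bit 0 (the
valid bit) is reserved, which is what the "3 bits" arithmetic implies
(64 - 1 - 40 - 16 - 4 = 3). The field order, the ANNOT_* names, struct
guest_annot and both helpers are hypothetical and are not taken from the
series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only; the real series may differ. */
#define ANNOT_VALID_BITS	1	/* bit 0 stays 0 so the pte is invalid */
#define ANNOT_TYPE_BITS		4
#define ANNOT_HANDLE_BITS	16
#define ANNOT_GFN_BITS		40	/* 52-bit IPA (LPA2) - 12-bit page offset */
#define ANNOT_OWNER_BITS	(64 - ANNOT_VALID_BITS - ANNOT_TYPE_BITS - \
				 ANNOT_HANDLE_BITS - ANNOT_GFN_BITS)

static_assert(ANNOT_OWNER_BITS == 3, "only 3 bits remain for the owner");

#define ANNOT_TYPE_SHIFT	ANNOT_VALID_BITS
#define ANNOT_HANDLE_SHIFT	(ANNOT_TYPE_SHIFT + ANNOT_TYPE_BITS)
#define ANNOT_GFN_SHIFT		(ANNOT_HANDLE_SHIFT + ANNOT_HANDLE_BITS)
#define ANNOT_OWNER_SHIFT	(ANNOT_GFN_SHIFT + ANNOT_GFN_BITS)

#define ANNOT_MASK(bits)	((UINT64_C(1) << (bits)) - 1)

/* Hypothetical unpacked form of a guest annotation. */
struct guest_annot {
	unsigned int	owner;
	unsigned int	type;
	uint16_t	handle;
	uint64_t	gfn;
};

/* Pack the fields into an invalid pte; reject values that do not fit. */
static int annot_encode(const struct guest_annot *a, uint64_t *pte)
{
	if (a->owner > ANNOT_MASK(ANNOT_OWNER_BITS) ||
	    a->type > ANNOT_MASK(ANNOT_TYPE_BITS) ||
	    a->gfn > ANNOT_MASK(ANNOT_GFN_BITS))
		return -EINVAL;

	*pte = ((uint64_t)a->owner << ANNOT_OWNER_SHIFT) |
	       (a->gfn << ANNOT_GFN_SHIFT) |
	       ((uint64_t)a->handle << ANNOT_HANDLE_SHIFT) |
	       ((uint64_t)a->type << ANNOT_TYPE_SHIFT);
	return 0;
}

/* Unpack an annotation previously produced by annot_encode(). */
static struct guest_annot annot_decode(uint64_t pte)
{
	struct guest_annot a = {
		.owner	= (pte >> ANNOT_OWNER_SHIFT) & ANNOT_MASK(ANNOT_OWNER_BITS),
		.type	= (pte >> ANNOT_TYPE_SHIFT) & ANNOT_MASK(ANNOT_TYPE_BITS),
		.handle	= (pte >> ANNOT_HANDLE_SHIFT) & ANNOT_MASK(ANNOT_HANDLE_BITS),
		.gfn	= (pte >> ANNOT_GFN_SHIFT) & ANNOT_MASK(ANNOT_GFN_BITS),
	};

	return a;
}
```

With this layout every bit above the valid bit is spoken for, so widening
any one field means shrinking another.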

Despite that, I think it's a slight improvement for now. We may end up
revisiting it if we want to squeeze more bits out of the host
annotations, but I'll include it in v2 for you to have a look at.

Will


