[PATCH v5 10/10] KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
Marc Zyngier
maz at kernel.org
Wed May 21 05:04:21 PDT 2025
On Wed, 21 May 2025 12:43:08 +0100,
Vincent Donnefort <vdonnefort at google.com> wrote:
>
> On Wed, May 21, 2025 at 12:01:26PM +0100, Marc Zyngier wrote:
> > On Tue, 20 May 2025 09:52:01 +0100,
> > Vincent Donnefort <vdonnefort at google.com> wrote:
> > >
> > > With the introduction of stage-2 huge mappings in the pKVM hypervisor,
> > > CMOs on guest pages are needed at PMD_SIZE granularity. The fixmap
> > > only supports PAGE_SIZE, and iterating over the huge page one page at
> > > a time is time consuming (mostly due to the TLBI on
> > > hyp_fixmap_unmap), which is a problem for EL2 latency.
> > >
> > > Introduce a shared PMD_SIZE fixmap (hyp_fixblock_map/hyp_fixblock_unmap)
> > > to improve guest page CMOs when stage-2 huge mappings are installed.
> > >
> > > On a Pixel 6, the iterative solution resulted in a latency of ~700us,
> > > while the PMD_SIZE fixmap reduced it to ~100us.
> > >
> > > Because of the horrendous private range allocation that would be
> > > necessary, this is disabled on systems using 64KiB pages.
> > >
> > > Suggested-by: Quentin Perret <qperret at google.com>
> > > Signed-off-by: Vincent Donnefort <vdonnefort at google.com>
> > > Signed-off-by: Quentin Perret <qperret at google.com>
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > > index 1b43bcd2a679..2888b5d03757 100644
> > > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > > @@ -59,6 +59,11 @@ typedef u64 kvm_pte_t;
> > >
> > > #define KVM_PHYS_INVALID (-1ULL)
> > >
> > > +#define KVM_PTE_TYPE BIT(1)
> > > +#define KVM_PTE_TYPE_BLOCK 0
> > > +#define KVM_PTE_TYPE_PAGE 1
> > > +#define KVM_PTE_TYPE_TABLE 1
> > > +
> > > #define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2)
> > >
> > > #define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2)
> > > diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > index 230e4f2527de..6e83ce35c2f2 100644
> > > --- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > +++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > @@ -13,9 +13,11 @@
> > > extern struct kvm_pgtable pkvm_pgtable;
> > > extern hyp_spinlock_t pkvm_pgd_lock;
> > >
> > > -int hyp_create_pcpu_fixmap(void);
> > > +int hyp_create_fixmap(void);
> > > void *hyp_fixmap_map(phys_addr_t phys);
> > > void hyp_fixmap_unmap(void);
> > > +void *hyp_fixblock_map(phys_addr_t phys, size_t *size);
> > > +void hyp_fixblock_unmap(void);
> > >
> > > int hyp_create_idmap(u32 hyp_va_bits);
> > > int hyp_map_vectors(void);
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > index 1490820b9ebe..962948534179 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > @@ -216,34 +216,42 @@ static void guest_s2_put_page(void *addr)
> > > hyp_put_page(¤t_vm->pool, addr);
> > > }
> > >
> > > -static void clean_dcache_guest_page(void *va, size_t size)
> > > +static void __apply_guest_page(void *va, size_t size,
> > > + void (*func)(void *addr, size_t size))
> > > {
> > > size += va - PTR_ALIGN_DOWN(va, PAGE_SIZE);
> > > va = PTR_ALIGN_DOWN(va, PAGE_SIZE);
> > > size = PAGE_ALIGN(size);
> > >
> > > while (size) {
> > > - __clean_dcache_guest_page(hyp_fixmap_map(__hyp_pa(va)),
> > > - PAGE_SIZE);
> > > - hyp_fixmap_unmap();
> > > - va += PAGE_SIZE;
> > > - size -= PAGE_SIZE;
> > > + size_t map_size = PAGE_SIZE;
> > > + void *map;
> > > +
> > > + if (size >= PMD_SIZE)
> > > + map = hyp_fixblock_map(__hyp_pa(va), &map_size);
> >
> > You seem to consider that if size is PMD_SIZE (or more), then va must
> > be PMD-aligned. I don't think this is correct.
> >
> > Such an iterator should start by doing PAGE_SIZEd operations until va
> > is PMD-aligned. Only at this point can it perform PMD_SIZEd
> > operations, until the remaining size is less than PMD_SIZE. And at
> > that point, it's PAGE_SIZE all over again until the end.
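> > 
> > Something along those lines is what I'd expect (a completely untested
> > sketch, assuming hyp_fixblock_map() writes the size it actually
> > mapped back through its second parameter):
> > 
> > 	while (size) {
> > 		size_t map_size = PAGE_SIZE;
> > 		void *map;
> > 
> > 		/*
> > 		 * Only use the block fixmap once va has reached a PMD
> > 		 * boundary *and* at least PMD_SIZE is left. Otherwise,
> > 		 * fall back to page-sized operations, which naturally
> > 		 * covers the head and tail of a misaligned range.
> > 		 */
> > 		if (IS_ALIGNED((unsigned long)va, PMD_SIZE) &&
> > 		    size >= PMD_SIZE)
> > 			map = hyp_fixblock_map(__hyp_pa(va), &map_size);
> > 		else
> > 			map = hyp_fixmap_map(__hyp_pa(va));
> > 
> > 		func(map, map_size);
> > 
> > 		if (map_size == PMD_SIZE)
> > 			hyp_fixblock_unmap();
> > 		else
> > 			hyp_fixmap_unmap();
> > 
> > 		va += map_size;
> > 		size -= map_size;
> > 	}
> > 
> > That keeps the head/block/tail handling in a single loop instead of
> > three.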
>
> Argh, yes you're right :-\
>
> Shall I respin a v6 with that fix or shall I wait a bit more?
Please send a new version ASAP, as I'm really getting very close to
locking down the tree (and I keep finding embarrassing bugs...).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.