[PATCH v5 10/10] KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
Marc Zyngier
maz at kernel.org
Wed May 21 05:04:21 PDT 2025
On Wed, 21 May 2025 12:43:08 +0100,
Vincent Donnefort <vdonnefort at google.com> wrote:
>
> On Wed, May 21, 2025 at 12:01:26PM +0100, Marc Zyngier wrote:
> > On Tue, 20 May 2025 09:52:01 +0100,
> > Vincent Donnefort <vdonnefort at google.com> wrote:
> > >
> > > With the introduction of stage-2 huge mappings in the pKVM hypervisor,
> > > CMOs on guest pages are needed at PMD_SIZE granularity. The fixmap
> > > only supports PAGE_SIZE, and iterating over the huge page one page at
> > > a time is time consuming (mostly due to the TLBI on
> > > hyp_fixmap_unmap), which is a problem for EL2 latency.
> > >
> > > Introduce a shared PMD_SIZE fixmap (hyp_fixblock_map/hyp_fixblock_unmap)
> > > to improve guest page CMOs when stage-2 huge mappings are installed.
> > >
> > > On a Pixel 6, the iterative solution resulted in a latency of ~700us,
> > > while the PMD_SIZE fixmap reduced it to ~100us.
> > >
> > > Because of the horrendous private range allocation that would be
> > > necessary, this is disabled on systems using 64KiB pages.
> > >
> > > Suggested-by: Quentin Perret <qperret at google.com>
> > > Signed-off-by: Vincent Donnefort <vdonnefort at google.com>
> > > Signed-off-by: Quentin Perret <qperret at google.com>
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > > index 1b43bcd2a679..2888b5d03757 100644
> > > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > > @@ -59,6 +59,11 @@ typedef u64 kvm_pte_t;
> > >
> > > #define KVM_PHYS_INVALID (-1ULL)
> > >
> > > +#define KVM_PTE_TYPE BIT(1)
> > > +#define KVM_PTE_TYPE_BLOCK 0
> > > +#define KVM_PTE_TYPE_PAGE 1
> > > +#define KVM_PTE_TYPE_TABLE 1
> > > +
> > > #define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2)
> > >
> > > #define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2)
> > > diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > index 230e4f2527de..6e83ce35c2f2 100644
> > > --- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > +++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> > > @@ -13,9 +13,11 @@
> > > extern struct kvm_pgtable pkvm_pgtable;
> > > extern hyp_spinlock_t pkvm_pgd_lock;
> > >
> > > -int hyp_create_pcpu_fixmap(void);
> > > +int hyp_create_fixmap(void);
> > > void *hyp_fixmap_map(phys_addr_t phys);
> > > void hyp_fixmap_unmap(void);
> > > +void *hyp_fixblock_map(phys_addr_t phys, size_t *size);
> > > +void hyp_fixblock_unmap(void);
> > >
> > > int hyp_create_idmap(u32 hyp_va_bits);
> > > int hyp_map_vectors(void);
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > index 1490820b9ebe..962948534179 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > @@ -216,34 +216,42 @@ static void guest_s2_put_page(void *addr)
> > > hyp_put_page(¤t_vm->pool, addr);
> > > }
> > >
> > > -static void clean_dcache_guest_page(void *va, size_t size)
> > > +static void __apply_guest_page(void *va, size_t size,
> > > + void (*func)(void *addr, size_t size))
> > > {
> > > size += va - PTR_ALIGN_DOWN(va, PAGE_SIZE);
> > > va = PTR_ALIGN_DOWN(va, PAGE_SIZE);
> > > size = PAGE_ALIGN(size);
> > >
> > > while (size) {
> > > - __clean_dcache_guest_page(hyp_fixmap_map(__hyp_pa(va)),
> > > - PAGE_SIZE);
> > > - hyp_fixmap_unmap();
> > > - va += PAGE_SIZE;
> > > - size -= PAGE_SIZE;
> > > + size_t map_size = PAGE_SIZE;
> > > + void *map;
> > > +
> > > + if (size >= PMD_SIZE)
> > > + map = hyp_fixblock_map(__hyp_pa(va), &map_size);
> >
> > You seem to consider that if size is PMD_SIZE (or more), then va must
> > be PMD-aligned. I don't think this is correct.
> >
> > Such an iterator should start by doing PAGE_SIZEd operations until va
> > is PMD-aligned. Only at this point can it perform PMD_SIZEd
> > operations, until the remaining size is less than PMD_SIZE. And at
> > that point, it's PAGE_SIZE all over again until the end.
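> > 
> > Something along those lines is what I'd expect (a completely untested
> > sketch, assuming hyp_fixblock_map() writes the size it actually
> > mapped back through its second parameter):
> > 
> > 	while (size) {
> > 		size_t map_size = PAGE_SIZE;
> > 		void *map;
> > 
> > 		/*
> > 		 * Only use the block fixmap once va has reached a PMD
> > 		 * boundary *and* at least PMD_SIZE is left. Otherwise,
> > 		 * fall back to page-sized operations, which naturally
> > 		 * covers the head and tail of a misaligned range.
> > 		 */
> > 		if (IS_ALIGNED((unsigned long)va, PMD_SIZE) &&
> > 		    size >= PMD_SIZE)
> > 			map = hyp_fixblock_map(__hyp_pa(va), &map_size);
> > 		else
> > 			map = hyp_fixmap_map(__hyp_pa(va));
> > 
> > 		func(map, map_size);
> > 
> > 		if (map_size == PMD_SIZE)
> > 			hyp_fixblock_unmap();
> > 		else
> > 			hyp_fixmap_unmap();
> > 
> > 		va += map_size;
> > 		size -= map_size;
> > 	}
> > 
> > That keeps the head/block/tail handling in a single loop instead of
> > three.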
>
> Argh, yes you're right :-\
>
> Shall I respin a v6 with that fix or shall I wait a bit more?
Please send a new version ASAP, as I'm really getting very close to
locking down the tree (and I keep finding embarrassing bugs...).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.