[PATCH v4 17/20] KVM: x86/mmu: Zap collapsible SPTEs at all levels in the shadow MMU

Mon May 9 09:31:46 PDT 2022

Maybe a slight tweak to the shortlog?  "Zap collapsible SPTEs at all levels in
the shadow MMU" left me wondering "when is KVM zapping at all levels?"

  KVM: x86/mmu: Zap all possible levels in shadow MMU when collapsing SPTEs

On Fri, Apr 22, 2022, David Matlack wrote:
> Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e.
> in the rmap). This is fine for now KVM never creates intermediate huge
> pages during dirty logging, i.e. a 1GiB page is never partially split to
> a 2MiB page.

"partially" is really confusing.  I think what you mean is that KVM can split a
1GiB page to 2MiB pages, and not split all the way down to 4KiB.  But
"partially" makes it sound like KVM ends up with a huge SPTE that is half split
or something.  I think you can just avoid that altogether and be more explicit:

  i.e. a 1GiB page is never split to just 2MiB pages; dirty logging always
  splits down to 4KiB pages.

> However, this will stop being true once the shadow MMU participates in
> eager page splitting, which can in fact leave behind partially split

"partially" again.  Maybe

  which can in fact leave behind 2MiB pages after splitting 1GiB huge pages.

> huge pages. In preparation for that change, change the shadow MMU to
> iterate over all necessary levels when zapping collapsible SPTEs.
> 
> No functional change intended.
> 
> Reviewed-by: Peter Xu <peterx at redhat.com>
> Signed-off-by: David Matlack <dmatlack at google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index ed65899d15a2..479c581e8a96 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6098,18 +6098,25 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>  	return need_tlb_flush;
>  }
>  
> +static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> +					   const struct kvm_memory_slot *slot)
> +{
> +	/*
> +	 * Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap
> +	 * pages that are already mapped at the maximum possible level.
> +	 */
> +	if (slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
> +			      PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1,
> +			      true))
> +		kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> +}
> +
>  void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
>  				   const struct kvm_memory_slot *slot)
>  {
>  	if (kvm_memslots_have_rmaps(kvm)) {
>  		write_lock(&kvm->mmu_lock);
> -		/*
> -		 * Zap only 4k SPTEs since the legacy MMU only supports dirty
> -		 * logging at a 4k granularity and never creates collapsible
> -		 * 2m SPTEs during dirty logging.
> -		 */
> -		if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true))
> -			kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> +		kvm_rmap_zap_collapsible_sptes(kvm, slot);
>  		write_unlock(&kvm->mmu_lock);
>  	}
>  
> -- 
> 2.36.0.rc2.479.g8af0fa9b8e-goog
>
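
For anyone following along, here is a rough userspace sketch of the level range
the new helper passes to slot_handle_level().  It is not KVM code.  It assumes
the x86 numbering where 4K, 2M and 1G are levels 1, 2 and 3, and that the max
hugepage level is 1G; every sketch_* name is made up for illustration.

```c
#include <assert.h>

/* Assumption: mirrors x86's page level numbering (4K=1, 2M=2, 1G=3). */
enum sketch_level {
	SKETCH_LEVEL_4K = 1,
	SKETCH_LEVEL_2M = 2,
	SKETCH_LEVEL_1G = 3,
};

/* Assumption: stands in for KVM_MAX_HUGEPAGE_LEVEL on x86. */
#define SKETCH_MAX_HUGEPAGE_LEVEL SKETCH_LEVEL_1G

/* Number of 4KiB pages covered by one mapping at @level (9 bits per level). */
static inline unsigned long sketch_pages_per_level(int level)
{
	assert(level >= SKETCH_LEVEL_4K && level <= SKETCH_MAX_HUGEPAGE_LEVEL);
	return 1UL << ((level - 1) * 9);
}

/*
 * Count the levels visited when walking [start_level, end_level], inclusive.
 * The bounds play the role of the two level arguments in the patch.
 */
static inline int sketch_count_levels(int start_level, int end_level)
{
	int level, nr_levels = 0;

	for (level = start_level; level <= end_level; level++)
		nr_levels++;

	return nr_levels;
}
```

With SKETCH_LEVEL_4K and SKETCH_MAX_HUGEPAGE_LEVEL - 1 as the bounds, the walk
covers the 4K and 2M levels only.  Mappings already at 1G are skipped because
there is nothing larger to collapse them into, which matches the comment in the
patch.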