[PATCH v4 11/20] KVM: x86/mmu: Allow for NULL vcpu pointer in __kvm_mmu_get_shadow_page()

Mon May 9 14:26:29 PDT 2022

On Thu, May 5, 2022 at 4:33 PM Sean Christopherson <seanjc at google.com> wrote:
>
> On Fri, Apr 22, 2022, David Matlack wrote:
> > Allow the vcpu pointer in __kvm_mmu_get_shadow_page() to be NULL. Rename
> > it to vcpu_or_null to prevent future commits from accidentally taking a
> > dependency on it without first considering the NULL case.
> >
> > The vcpu pointer is only used for syncing indirect shadow pages in
> > kvm_mmu_find_shadow_page(). A vcpu pointer is not required for
> > correctness since unsync pages can simply be zapped. But this should
> > never occur in practice, since the only use-case for passing a NULL vCPU
> > pointer is eager page splitting which will only request direct shadow
> > pages (which can never be unsync).
> >
> > Even though __kvm_mmu_get_shadow_page() can gracefully handle a NULL
> > vcpu, add a WARN() that will fire if __kvm_mmu_get_shadow_page() is ever
> > called to get an indirect shadow page with a NULL vCPU pointer, since
> > zapping unsync SPs is a performance overhead that should be considered.
> >
> > Signed-off-by: David Matlack <dmatlack at google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 40 ++++++++++++++++++++++++++++++++--------
> >  1 file changed, 32 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 04029c01aebd..21407bd4435a 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1845,16 +1845,27 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,
> >         &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)])     \
> >               if ((_sp)->gfn != (_gfn) || (_sp)->role.direct) {} else
> >
> > -static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> > -                      struct list_head *invalid_list)
> > +static int __kvm_sync_page(struct kvm *kvm, struct kvm_vcpu *vcpu_or_null,
> > +                        struct kvm_mmu_page *sp,
> > +                        struct list_head *invalid_list)
> >  {
> > -     int ret = vcpu->arch.mmu->sync_page(vcpu, sp);
> > +     int ret = -1;
> > +
> > +     if (vcpu_or_null)
>
> This should never happen.  I like the idea of warning early, but I really don't
> like that the WARN is far removed from the code that actually depends on @vcpu
> being non-NULL. Case in point, KVM should have bailed on the WARN and never
> reached this point.  And the inner __kvm_sync_page() is completely unnecessary.

Yeah that's fair.

>
> I also don't love the vcpu_or_null terminology; I get the intent, but it doesn't
> really help the reader understand why/when it's NULL.

Eh, I don't think it needs to encode why or when. It just needs to
flag to the reader (and future code authors) that this vcpu pointer
(unlike all other vcpu pointers in KVM) is NULL in certain cases.

>
> I played around with casting, e.g. to/from an unsigned long or void *, to prevent
> usage, but that doesn't work very well because 'unsigned long' ends up being
> awkward/confusing, and 'void *' is easily lost on a function call.  And both
> lose type safety :-(

Yet another shortcoming of C :(

(The other being our other discussion about the RET_PF* return codes
getting easily misinterpreted as KVM's magic return-to-user /
continue-running-guest return codes.)

Makes me miss Rust!

>
> All in all, I think I'd prefer this patch to simply be a KVM_BUG_ON() if
> kvm_mmu_find_shadow_page() encounters an unsync page.  Less churn, and IMO there's
> no real loss in robustness, e.g. we'd really have to screw up code review and
> testing to introduce a null vCPU pointer dereference in this code.

Agreed about moving the check here and dropping __kvm_sync_page(). But
I would prefer to retain the vcpu_or_null name (or at least something
other than "vcpu" to indicate there's something non-standard about
this pointer).
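
To make the shape of that concrete, here is a rough userspace sketch of
"check for NULL right where the pointer is needed". It is not kernel code:
mock_kvm, mock_vcpu, mock_sp, mock_bug_on() and mock_find_shadow_page() are
all hypothetical stand-ins, and mock_bug_on() only loosely imitates
KVM_BUG_ON() by setting a flag and returning the condition. The lookup is
also far simpler than the real hash walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the real KVM structures. */
struct mock_vcpu { int id; };

struct mock_sp {
	unsigned long gfn;
	bool direct;
	bool unsync;
};

struct mock_kvm {
	bool bugged;		/* set once an "impossible" condition is hit */
	struct mock_sp *pages;
	size_t nr_pages;
};

/* Loose analogue of KVM_BUG_ON(): flag the VM and return the condition. */
static bool mock_bug_on(bool cond, struct mock_kvm *kvm)
{
	if (cond) {
		kvm->bugged = true;
		fprintf(stderr, "BUG: unsync page found with NULL vcpu\n");
	}
	return cond;
}

/* vcpu_or_null may be NULL only when the caller wants a direct page. */
struct mock_sp *mock_find_shadow_page(struct mock_kvm *kvm,
				      struct mock_vcpu *vcpu_or_null,
				      unsigned long gfn, bool direct)
{
	assert(kvm != NULL);

	for (size_t i = 0; i < kvm->nr_pages; i++) {
		struct mock_sp *sp = &kvm->pages[i];

		if (sp->gfn != gfn || sp->direct != direct)
			continue;

		if (sp->unsync) {
			/* The check sits next to the code that needs the vCPU. */
			if (mock_bug_on(!vcpu_or_null, kvm))
				break;

			/* Stand-in for syncing the page with vCPU state. */
			sp->unsync = false;
		}
		return sp;
	}
	return NULL;
}
```

With this layout a NULL pointer is harmless for direct pages, and an unsync
page looked up without a vCPU marks the (mock) VM as bugged and bails out
of the lookup instead of dereferencing NULL.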

>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3d102522804a..5aed9265f592 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2041,6 +2041,13 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
>                         goto out;
>
>                 if (sp->unsync) {
> +                       /*
> +                        * Getting indirect shadow pages without a vCPU pointer
> +                        * is not supported, i.e. this should never happen.
> +                        */
> +                       if (KVM_BUG_ON(!vcpu, kvm))
> +                               break;
> +
>                         /*
>                          * The page is good, but is stale.  kvm_sync_page does
>                          * get the latest guest state, but (unlike mmu_unsync_children)
>