[PATCH v2 3/4] KVM: arm64: Factor out pKVM hyp vcpu creation to separate function

Will Deacon will at kernel.org
Mon Mar 3 11:18:13 PST 2025


On Mon, Mar 03, 2025 at 07:57:00AM +0000, Fuad Tabba wrote:
> On Fri, 28 Feb 2025 at 19:44, Quentin Perret <qperret at google.com> wrote:
> >
> > On Wednesday 26 Feb 2025 at 21:55:19 (+0000), Fuad Tabba wrote:
> > >  static int __pkvm_create_hyp_vm(struct kvm *host_kvm)
> > >  {
> > > -     size_t pgd_sz, hyp_vm_sz, hyp_vcpu_sz;
> > > +     size_t pgd_sz, hyp_vm_sz;
> > >       struct kvm_vcpu *host_vcpu;
> > > -     pkvm_handle_t handle;
> > >       void *pgd, *hyp_vm;
> > >       unsigned long idx;
> > >       int ret;
> > > @@ -161,33 +178,12 @@ static int __pkvm_create_hyp_vm(struct kvm *host_kvm)
> > >       if (ret < 0)
> > >               goto free_vm;
> > >
> > > -     handle = ret;
> > > +     WRITE_ONCE(host_kvm->arch.pkvm.handle, ret);
> >
> > What's the reason to make this a WRITE_ONCE? Does it mean we should
> > update the readers to be READ_ONCE()?
> 
> I don't remember the original reason, to be honest. In this case, it
> was to make it consistent with downstream code in Android. That said,
> I plan on revising all of these soon and fixing this (and related
> code) in light of Will's comment regarding potential specter gadgets:
> 
> https://lore.kernel.org/all/20250218092705.GA17030@willie-the-truck/

I'm not sure the spectre stuff changes the concurrency aspects here, so
Quentin's question presumably still stands even after that.

Looking at the Android code, this WRITE_ONCE() pairs with a READ_ONCE()
in pkvm_is_hyp_created() which is called without the config_lock held
by kvm_arch_prepare_memory_region(). However, given that
pkvm_is_hyp_created() is only testing against 0, I don't think the
_ONCE() accessors are doing anything useful in that case.

The more confusing stuff is where 'kvm->arch.pkvm.handle' is read
directly without the 'config_lock' held. The MMU notifiers look like
they can do that, so I wonder if there's a theoretical race where they
can race with first run and issue TLB invalidation with the wrong handle?
That would apply equally to the upstream code, I think.

Will



More information about the linux-arm-kernel mailing list