[PATCH v2 4/4] KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
Fuad Tabba
tabba at google.com
Thu Feb 27 04:47:16 PST 2025
Hi Marc,
On Thu, 27 Feb 2025 at 04:09, Marc Zyngier <maz at kernel.org> wrote:
>
> On Wed, 26 Feb 2025 21:55:20 +0000,
> Fuad Tabba <tabba at google.com> wrote:
> >
> > Instead of creating and initializing _all_ hyp vcpus in pKVM when
> > the first host vcpu runs for the first time, initialize _each_
> > hyp vcpu in conjunction with its corresponding host vcpu.
> >
> > Some of the host vcpu state (e.g., system registers and traps
> > values) is not initialized until the first time the host vcpu is
> > run. Therefore, initializing a hyp vcpu before its corresponding
> > host vcpu has run for the first time might not view the complete
> > host state of these vcpus.
> >
> > Additionally, this behavior is inline with non-protected modes.
> >
> > Signed-off-by: Fuad Tabba <tabba at google.com>
> > ---
> > arch/arm64/include/asm/kvm_host.h | 2 +
> > arch/arm64/include/asm/kvm_pkvm.h | 1 +
> > arch/arm64/kvm/arm.c | 4 ++
> > arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 6 ---
> > arch/arm64/kvm/hyp/nvhe/pkvm.c | 54 +++++++++++++++-----------
> > arch/arm64/kvm/pkvm.c | 28 ++++++-------
> > 6 files changed, 53 insertions(+), 42 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 3a7ec98ef123..4a0e522f6001 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -869,6 +869,8 @@ struct kvm_vcpu_arch {
> > #define VCPU_INITIALIZED __vcpu_single_flag(cflags, BIT(0))
> > /* SVE config completed */
> > #define VCPU_SVE_FINALIZED __vcpu_single_flag(cflags, BIT(1))
> > +/* pKVM VCPU setup completed */
> > +#define VCPU_PKVM_FINALIZED __vcpu_single_flag(cflags, BIT(2))
> >
> > /* Exception pending */
> > #define PENDING_EXCEPTION __vcpu_single_flag(iflags, BIT(0))
> > diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
> > index eb65f12e81d9..abd693ce5b93 100644
> > --- a/arch/arm64/include/asm/kvm_pkvm.h
> > +++ b/arch/arm64/include/asm/kvm_pkvm.h
> > @@ -19,6 +19,7 @@
> > int pkvm_init_host_vm(struct kvm *kvm);
> > int pkvm_create_hyp_vm(struct kvm *kvm);
> > void pkvm_destroy_hyp_vm(struct kvm *kvm);
> > +int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu);
> >
> > /*
> > * This functions as an allow-list of protected VM capabilities.
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index b8e55a441282..cd7a808eeb64 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -833,6 +833,10 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > ret = pkvm_create_hyp_vm(kvm);
> > if (ret)
> > return ret;
> > +
> > + ret = pkvm_create_hyp_vcpu(vcpu);
> > + if (ret)
> > + return ret;
> > }
> >
> > mutex_lock(&kvm->arch.config_lock);
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> > index e42bf68c8848..ce31d3b73603 100644
> > --- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> > @@ -43,12 +43,6 @@ struct pkvm_hyp_vm {
> > struct hyp_pool pool;
> > hyp_spinlock_t lock;
> >
> > - /*
> > - * The number of vcpus initialized and ready to run.
> > - * Modifying this is protected by 'vm_table_lock'.
> > - */
> > - unsigned int nr_vcpus;
> > -
> > /* Array of the hyp vCPU structures for this VM. */
> > struct pkvm_hyp_vcpu *vcpus[];
> > };
> > diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > index 6efb9bf56180..4ef3748dc660 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > @@ -245,10 +245,12 @@ struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle,
> >
> > hyp_spin_lock(&vm_table_lock);
> > hyp_vm = get_vm_by_handle(handle);
> > - if (!hyp_vm || hyp_vm->nr_vcpus <= vcpu_idx)
> > + if (!hyp_vm || hyp_vm->kvm.created_vcpus <= vcpu_idx)
>
> Why is it reasonable to trust kvm.created_vcpus here? It looks like
> something that is under direct control of the host, and that could be
> arbitrarily changed to allow dereferencing the vcpus[] array past its
> actual boundaries.
>
> What guarantees that it is stable at load-time?
>
> I have similar misgivings about the init and teardown paths.
This is the hypervisor's view of created_vcpus
(hyp_vm->kvm.created_vcpus), i.e., not the host's (not
host_kvm->created_vcpus). We read it once from the host at
__pkvm_init_vm(). We should know that it wouldn't overflow the vcpus
array since it's the value of created_vcpus (nr_vcpus) we use when we
calculate the vm_size in __pkvm_init_vm(), and try to map the
corresponding amount of donated memory for the hyp_vm (all in
__pkvm_init_vm()).
Since it's all hyp values, it should be fine, right?
Cheers,
/fuad
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list