[PATCH v2 12/24] KVM: arm64: Introduce shadow VM state at EL2

Wed Jul 20 11:20:05 PDT 2022

Hi Vincent,

Thanks for going through this.

On Mon, Jul 18, 2022 at 07:40:05PM +0100, Vincent Donnefort wrote:
> [...]
> 
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > index 9f339dffbc1a..2d6b5058f7d3 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -288,6 +288,14 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
> >   */
> >  u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
> >  
> > +/*
> 
> /** ?
> 
> > + * kvm_pgtable_stage2_pgd_size() - Helper to compute size of a stage-2 PGD
> > + * @vtcr:	Content of the VTCR register.
> > + *
> > + * Return: the size (in bytes) of the stage-2 PGD
> > + */

I'll also check this is valid kernel-doc before adding the new comment
syntax!

> > +/*
> > + * Holds the relevant data for maintaining the vcpu state completely at hyp.
> > + */
> > +struct kvm_shadow_vcpu_state {
> > +	/* The data for the shadow vcpu. */
> > +	struct kvm_vcpu shadow_vcpu;
> > +
> > +	/* A pointer to the host's vcpu. */
> > +	struct kvm_vcpu *host_vcpu;
> > +
> > +	/* A pointer to the shadow vm. */
> > +	struct kvm_shadow_vm *shadow_vm;
> 
> IMHO, those declarations are already self-explanatory. The comments above don't
> bring much.

Agreed, and Sean has ideas to rework bits of this as well. I'll drop the
comments.

> > diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > index 99c8d8b73e70..77aeb787670b 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > @@ -7,6 +7,9 @@
> >  #include <linux/kvm_host.h>
> >  #include <linux/mm.h>
> >  #include <nvhe/fixed_config.h>
> > +#include <nvhe/mem_protect.h>
> > +#include <nvhe/memory.h>
> 
> I don't think this one is necessary, it is already included in mm.h.

I thought it was generally bad form to rely on transitive includes, as it
makes header rework even more painful than it already is.

> > +static void unpin_host_vcpus(struct kvm_shadow_vcpu_state *shadow_vcpu_states,
> > +			     unsigned int nr_vcpus)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < nr_vcpus; i++) {
> > +		struct kvm_vcpu *host_vcpu = shadow_vcpu_states[i].host_vcpu;
> 
> IIRC, checkpatch likes an empty line after declarations.

We can fix that!

> > +static unsigned int insert_shadow_table(struct kvm *kvm,
> > +					struct kvm_shadow_vm *vm,
> > +					size_t shadow_size)
> > +{
> > +	struct kvm_s2_mmu *mmu = &vm->kvm.arch.mmu;
> > +	unsigned int shadow_handle;
> > +	unsigned int vmid;
> > +
> > +	hyp_assert_lock_held(&shadow_lock);
> > +
> > +	if (unlikely(nr_shadow_entries >= KVM_MAX_PVMS))
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * Initializing protected state might have failed, yet a malicious host
> > +	 * could trigger this function. Thus, ensure that shadow_table exists.
> > +	 */
> > +	if (unlikely(!shadow_table))
> > +		return -EINVAL;
> > +
> > +	/* Check that a shadow hasn't been created before for this host KVM. */
> > +	if (unlikely(__exists_shadow(kvm)))
> > +		return -EEXIST;
> > +
> > +	/* Find the next free entry in the shadow table. */
> > +	while (shadow_table[next_shadow_alloc])
> > +		next_shadow_alloc = (next_shadow_alloc + 1) % KVM_MAX_PVMS;
> 
> Couldn't it be merged with __exists_shadow which already knows the first free
> shadow_table idx?

Good idea, that would save us going through it twice.

> 
> > +	shadow_handle = idx_to_shadow_handle(next_shadow_alloc);
> > +
> > +	vm->kvm.arch.pkvm.shadow_handle = shadow_handle;
> > +	vm->shadow_area_size = shadow_size;
> > +
> > +	/* VMID 0 is reserved for the host */
> > +	vmid = next_shadow_alloc + 1;
> > +	if (vmid > 0xff)
> 
> Couldn't the 0xff be found with get_vmid_bits() or even from host_kvm.arch.vtcr?
> Or does that depends on something completely different?
> 
> Also, appologies if this has been discussed already and I missed it, maybe
> KVM_MAX_PVMS could be changed for that value - 1. Unless we think that archs
> supporting 16 bits would waste way too much memory for that?

We should probably clamp the VMID based on KVM_MAX_PVMS here, as although
some CPUs support 16-bit VMIDs, we don't currently support that with pKVM.
I'll make that change to avoid hard-coding the constant here.

Thanks!

Will