[RFC PATCH v3 08/16] KVM: arm64: Add a new VCPU device control group for SPE

Thu Nov 19 11:58:38 EST 2020

Hi Alex,

On 27/10/2020 17:26, Alexandru Elisei wrote:
> From: Sudeep Holla <sudeep.holla at arm.com>
> 
> To configure the virtual SPE buffer management interrupt number, we use a
> VCPU kvm_device ioctl, encapsulating the KVM_ARM_VCPU_SPE_IRQ attribute
> within the KVM_ARM_VCPU_SPE_CTRL group.
> 
> After configuring the SPE, userspace is required to call the VCPU ioctl
> with the attribute KVM_ARM_VCPU_SPE_INIT to initialize SPE on the VCPU.

> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 2acec3b9ef65..6135b9827fbe 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -161,3 +161,43 @@ Specifies the base address of the stolen time structure for this VCPU. The
>  base address must be 64 byte aligned and exist within a valid guest memory
>  region. See Documentation/virt/kvm/arm/pvtime.rst for more information
>  including the layout of the stolen time structure.
> +
> +4. GROUP: KVM_ARM_VCPU_SPE_CTRL
> +===============================
> +
> +:Architectures: ARM64
> +
> +4.1 ATTRIBUTE: KVM_ARM_VCPU_SPE_IRQ
> +-----------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address for the SPE buffer management
> +             interrupt is a pointer to an int
> +
> +Returns:
> +
> +	 =======  ========================================================
> +	 -EBUSY   The SPE buffer management interrupt is already set
> +	 -EINVAL  Invalid SPE overflow interrupt number
> +	 -EFAULT  Could not read the buffer management interrupt number
> +	 -ENXIO   SPE not supported or not properly configured
> +	 =======  ========================================================
> +
> +A value describing the SPE (Statistical Profiling Extension) overflow interrupt
> +number for this vcpu. This interrupt should be a PPI and the interrupt type and
> +number must be same for each vcpu.
> +
> +4.2 ATTRIBUTE: KVM_ARM_VCPU_SPE_INIT
> +------------------------------------
> +
> +:Parameters: no additional parameter in kvm_device_attr.addr
> +
> +Returns:
> +
> +	 =======  ======================================================
> +	 -EBUSY   SPE already initialized
> +	 -ENODEV  GIC not initialized
> +	 -ENXIO   SPE not supported or not properly configured
> +	 =======  ======================================================

> +Request the initialization of the SPE. Must be done after initializing the
> +in-kernel irqchip and after setting the interrupt number for the VCPU.

Fantastic!

> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index f32490229a4c..4dc205fa4be1 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -87,6 +87,9 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>  		r = system_has_full_ptr_auth();
>  		break;
> +	case KVM_CAP_ARM_SPE:
> +		r = kvm_arm_supports_spe();
> +		break;
>  	default:
>  		r = 0;
>  	}
> @@ -223,6 +226,19 @@ static int kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static int kvm_vcpu_enable_spe(struct kvm_vcpu *vcpu)
> +{
> +	if (!kvm_arm_supports_spe())
> +		return -EINVAL;
> +
> +	/* SPE is disabled if the PE is in AArch32 state */
> +	if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
> +		return -EINVAL;
> +
> +	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SPE;
> +	return 0;
> +}

VCPU-reset promotes the VMM feature into flags. How does this interact with
kvm_arm_spe_init()?

It doesn't look like this resets any state, couldn't it be done once by kvm_arm_spe_init()?

>  /**
>   * kvm_reset_vcpu - sets core registers and sys_regs to reset value
>   * @vcpu: The VCPU pointer
> @@ -274,6 +290,13 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  		}
>  	}
>  
> +	if (test_bit(KVM_ARM_VCPU_SPE, vcpu->arch.features)) {
> +		if (kvm_vcpu_enable_spe(vcpu)) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +	}
> +
>  	switch (vcpu->arch.target) {
>  	default:
>  		if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {

> diff --git a/arch/arm64/kvm/spe.c b/arch/arm64/kvm/spe.c
> new file mode 100644
> index 000000000000..f91a52cd7cd3
> --- /dev/null
> +++ b/arch/arm64/kvm/spe.c
> @@ -0,0 +1,129 @@

> +static bool kvm_arm_spe_irq_is_valid(struct kvm *kvm, int irq)
> +{
> +	int i;
> +	struct kvm_vcpu *vcpu;
> +
> +	/* The SPE overflow interrupt can be a PPI only */
> +	if (!irq_is_ppi(irq))
> +		return false;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		if (!kvm_arm_spe_irq_initialized(vcpu))
> +			continue;
> +
> +		if (vcpu->arch.spe_cpu.irq_num != irq)
> +			return false;
> +	}

Looks like you didn't really want a vcpu property! (huh, patch 10 adds a vm property too)
We're making this a vcpu property because of the PPI and system registers? (both good reasons)

If the PPI number lived in struct kvm_arch, you'd only only need to check it was
uninitialised, or the same to get the same behaviour, which would save some of this error
handling.

> +	return true;
> +}

> diff --git a/include/kvm/arm_spe.h b/include/kvm/arm_spe.h
> index 46ec447ed013..0275e8097529 100644
> --- a/include/kvm/arm_spe.h
> +++ b/include/kvm/arm_spe.h
> @@ -18,11 +18,38 @@ struct kvm_spe_cpu {
>  	bool initialized; 	/* Feature is initialized on VCPU */
>  };
>  
> +#define kvm_arm_spe_irq_initialized(v)			\
> +	((v)->arch.spe_cpu.irq_num >= VGIC_NR_SGIS &&	\
> +	 (v)->arch.spe_cpu.irq_num < VGIC_MAX_PRIVATE)

Didn't GICv(mumbles) add an additional PPI range? Could this be made irq_is_ppi(), that
way if the vgic gains support for that, we don't get weird behaviour here?

Thanks,

James