[PATCH v3 2/3] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Reiji Watanabe
reijiw at google.com
Fri Mar 4 00:00:20 PST 2022
Hi Marc,
On Thu, Mar 3, 2022 at 8:11 AM Marc Zyngier <maz at kernel.org> wrote:
>
> Reiji,
>
> Please add a cover letter to your patches. It actually is important to
> track the changes as well as being an anchor in my email client.
Sure, I will add a cover letter for v4.
> On Thu, 03 Mar 2022 03:54:07 +0000,
> Reiji Watanabe <reijiw at google.com> wrote:
> >
> > KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs
> > for a guest. At vCPU reset, vcpu_allowed_register_width() checks
> > if the vcpu's register width is consistent with all other vCPUs'.
> > Since the checking is done even against vCPUs that are not initialized
> > (KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs
> > are erroneously treated as 64bit vCPU, which causes the function to
> > incorrectly detect a mixed-width VM.
> >
> > Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED
> > bits for kvm->arch.flags. A value of the EL1_32BIT bit indicates that
> > the guest needs to be configured with all 32bit or 64bit vCPUs, and
> > a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the
> > EL1_32BIT bit is valid (already set up). Values in those bits are set at
> > the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT
> > configuration for the vCPU.
> >
> > Check vcpu's register width against those new bits at the vcpu's
> > KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width).
> >
> > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
> > Signed-off-by: Reiji Watanabe <reijiw at google.com>
> > ---
> > arch/arm64/include/asm/kvm_emulate.h | 25 +++++++++++------
> > arch/arm64/include/asm/kvm_host.h | 8 ++++++
> > arch/arm64/kvm/arm.c | 41 ++++++++++++++++++++++++++++
> > arch/arm64/kvm/reset.c | 8 ------
> > 4 files changed, 65 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index d62405ce3e6d..f4f960819888 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -20,6 +20,7 @@
> > #include <asm/ptrace.h>
> > #include <asm/cputype.h>
> > #include <asm/virt.h>
> > +#include <asm/kvm_mmu.h>
>
> Huh... I wish we didn't drag that one here, it is eventually going to
> hurt...
>
> >
> > #define CURRENT_EL_SP_EL0_VECTOR 0x0
> > #define CURRENT_EL_SP_ELx_VECTOR 0x200
> > @@ -45,7 +46,14 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
> >
> > static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
> > {
> > - return !(vcpu->arch.hcr_el2 & HCR_RW);
> > + struct kvm *kvm;
> > +
> > + kvm = is_kernel_in_hyp_mode() ? kern_hyp_va(vcpu->kvm) : vcpu->kvm;
>
> Errr... On first approximation, this is the wrong way around. A VHE
> kernel doesn't need any repainting of the address, while a nVHE kernel
> does. Even more, a bit of context:
>
> static inline bool is_kernel_in_hyp_mode(void)
> {
> return read_sysreg(CurrentEL) == CurrentEL_EL2;
> }
>
> So not only the expression is the wrong way around, but it *cannot*
> distinguish VHE and nVHE when running at EL2. You're just lucky that
> the two bugs (on a single line) cancel each others.
>
> The only sane way to write this is to *not* look at the mode you're
> running in. kern_hyp_va() is designed to be nop'ed out on VHE.
Actually, I did it knowing kern_hyp_va() was nop on vhe and
kern_hyp_va() was needed for nvhe in hyp. I wanted to make the
function work whether it is nvhe hyp or non-hyp, or vhe...
> > +
> > + WARN_ON_ONCE(!test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED,
> > + &kvm->arch.flags));
> > +
> > + return test_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags);
> > }
>
> Given that this is used on the vcpu switch fast path at least twice
> per run, we need something better. You probably want to offer
> different primitives depending on the context:
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index d62405ce3e6d..daea0885c28d 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -43,10 +43,22 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
>
> void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
>
> +#if defined (__KVM_VHE_HYPERVISOR__) || defined (__KVM_NVHE_HYPERVISOR__)
> static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
> {
> return !(vcpu->arch.hcr_el2 & HCR_RW);
> }
> +#else
> +static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = kern_hyp_va(vcpu->kvm);
> +
> + WARN_ON_ONCE(!test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED,
> + &kvm->arch_flags));
> +
> + return test_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags);
> +}
> +#endif
>
> as you are guaranteed to have configured the width of the vcpu by the
> time you hit start messing with it in the context of the hypervisor.
Thank you for the proposal!
I will fix that based on the proposal.
> >
> > static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
> > @@ -72,15 +80,14 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
> > vcpu->arch.hcr_el2 |= HCR_TVM;
> > }
> >
> > - if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
> > + if (vcpu_el1_is_32bit(vcpu))
> > vcpu->arch.hcr_el2 &= ~HCR_RW;
> > -
> > - /*
> > - * TID3: trap feature register accesses that we virtualise.
> > - * For now this is conditional, since no AArch32 feature regs
> > - * are currently virtualised.
> > - */
> > - if (!vcpu_el1_is_32bit(vcpu))
> > + else
> > + /*
> > + * TID3: trap feature register accesses that we virtualise.
> > + * For now this is conditional, since no AArch32 feature regs
> > + * are currently virtualised.
> > + */
> > vcpu->arch.hcr_el2 |= HCR_TID3;
> >
> > if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) ||
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 11a7ae747ded..5cde7f7b5042 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -125,6 +125,14 @@ struct kvm_arch {
> > #define KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER 0
> > /* Memory Tagging Extension enabled for the guest */
> > #define KVM_ARCH_FLAG_MTE_ENABLED 1
> > + /*
> > + * The guest's EL1 register width. A value of KVM_ARCH_FLAG_EL1_32BIT
> > + * bit is valid only when KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED is set.
> > + * Otherwise, the guest's EL1 register width has not yet been
> > + * determined yet.
> > + */
> > +#define KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED 2
> > +#define KVM_ARCH_FLAG_EL1_32BIT 3
> > unsigned long flags;
> >
> > /*
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 9a2d240ef6a3..9ac75aa46e2f 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1101,6 +1101,43 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> > return -EINVAL;
> > }
> >
> > +/*
> > + * A guest can have either all EL1 32bit or 64bit vcpus only. It is
> > + * indicated by a value of KVM_ARCH_FLAG_EL1_32BIT bit in kvm->arch.flags,
> > + * which is valid only when KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED in
> > + * kvm->arch.flags is set.
> > + * This function checks if the vCPU's register width configuration is
> > + * consistent with a value of the EL1_32BIT bit in kvm->arch.flags
> > + * when the REG_WIDTH_CONFIGURED bit is set.
> > + * Otherwise, the function sets a value of EL1_32BIT bit based on the vcpu's
> > + * KVM_ARM_VCPU_EL1_32BIT configuration (and sets the REG_WIDTH_CONFIGURED
> > + * bit of kvm->arch.flags).
> > + */
> > +static int kvm_register_width_check_or_init(struct kvm_vcpu *vcpu)
>
> The naming is positively Java-esque! How about kvm_set_vm_width()
> instead? Also, please document the error code.
Sure, I will fix the name, and document the error code.
>
> > +{
> > + bool is32bit;
> > + bool allowed = true;
> > + struct kvm *kvm = vcpu->kvm;
> > +
> > + is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
> > +
> > + mutex_lock(&kvm->lock);
> > +
> > + if (test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &kvm->arch.flags)) {
> > + allowed = (is32bit ==
> > + test_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags));
> > + } else {
> > + if (is32bit)
> > + set_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags);
>
> nit: probably best written as:
>
> __assign_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags, is32bit);
>
> > +
> > + set_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &kvm->arch.flags);
>
> Since this is only ever set whilst holding the lock, you can user the
> __set_bit() version.
Thank you for the proposal. But since other CPUs could attempt
to set other bits without holding the lock, I don't think we
can use the non-atomic version here.
>
> > + }
> > +
> > + mutex_unlock(&kvm->lock);
> > +
> > + return allowed ? 0 : -EINVAL;
> > +}
> > +
> > static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> > const struct kvm_vcpu_init *init)
> > {
> > @@ -1140,6 +1177,10 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >
> > /* Now we know what it is, we can reset it. */
> > ret = kvm_reset_vcpu(vcpu);
> > +
> > + if (!ret)
> > + ret = kvm_register_width_check_or_init(vcpu);
>
> Why is that called *after* resetting the vcpu, which itself relies on
> KVM_ARM_VCPU_EL1_32BIT, which we agreed to get rid of as much as
> possible?
That's because I didn't want to set EL1_32BIT/REG_WIDTH_CONFIGURED
for the guest based on the vCPU for which KVM_ARM_VCPU_INIT would fail.
The flags can be set in the kvm_reset_vcpu() and cleared in
case of failure. But then that temporary value could lead
KVM_ARM_VCPU_INIT for other vCPUs to fail, which I don't think
is nice to do.
Another option (almost the same though) I considered was as
follows, which just had kvm_reset_vcpu() use the new flag
when available (The following is the diff from the v3 patches).
KVM_ARM_VCPU_EL1_32BIT is used anyway by kvm_reset_vcpu()
and kvm_set_vm_width() though...
---
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 6c5f7677057d..3542eeb48e5d 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -181,11 +181,8 @@ static int kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
return 0;
}
-static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
+static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu, bool is32bit)
{
- bool is32bit;
-
- is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1) && is32bit)
return false;
@@ -218,14 +215,27 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
{
struct vcpu_reset_state reset_state;
int ret;
- bool loaded;
+ bool loaded, is32bit;
+ bool allowed = true;
u32 pstate;
+ is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
+
mutex_lock(&vcpu->kvm->lock);
- reset_state = vcpu->arch.reset_state;
- WRITE_ONCE(vcpu->arch.reset_state.reset, false);
+ if (test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &vcpu->kvm->arch.flags))
+ allowed = (is32bit == vcpu_el1_is_32bit(vcpu));
+ else
+ allowed = vcpu_allowed_register_width(vcpu, is32bit);
+
+ if (allowed) {
+ reset_state = vcpu->arch.reset_state;
+ WRITE_ONCE(vcpu->arch.reset_state.reset, false);
+ }
mutex_unlock(&vcpu->kvm->lock);
+ if (!allowed)
+ return -EINVAL;
+
/* Reset PMU outside of the non-preemptible section */
kvm_pmu_vcpu_reset(vcpu);
@@ -252,14 +262,9 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
}
}
- if (!vcpu_allowed_register_width(vcpu)) {
- ret = -EINVAL;
- goto out;
- }
-
switch (vcpu->arch.target) {
default:
- if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
+ if (is32bit) {
pstate = VCPU_RESET_PSTATE_SVC;
} else {
pstate = VCPU_RESET_PSTATE_EL1;
--
Thanks,
Reiji
More information about the linux-arm-kernel
mailing list