[PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Reiji Watanabe
reijiw at google.com
Wed Jan 12 21:33:21 PST 2022
Hi Marc and Alex,
On Tue, Jan 11, 2022 at 8:11 AM Alexandru Elisei
<alexandru.elisei at arm.com> wrote:
>
> Hi Marc,
>
> On Tue, Jan 11, 2022 at 01:30:41PM +0000, Marc Zyngier wrote:
> > On Tue, 11 Jan 2022 07:37:57 +0000,
> > Reiji Watanabe <reijiw at google.com> wrote:
> > >
> > > Hi Alex,
> > >
> > > On Mon, Jan 10, 2022 at 2:48 AM Alexandru Elisei
> > > <alexandru.elisei at arm.com> wrote:
> > > >
> > > > Hi Reiji,
> > > >
> > > > On Sun, Jan 09, 2022 at 09:40:41PM -0800, Reiji Watanabe wrote:
> > > > > vcpu_allowed_register_width() checks if all the VCPUs are either
> > > > > all 32bit or all 64bit. Since the checking is done even for vCPUs
> > > > > that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet,
> > > > > the non-initialized vCPUs are erroneously treated as 64bit vCPU,
> > > > > which causes the function to incorrectly detect a mixed-width VM.
> > > > >
> > > > > Fix vcpu_allowed_register_width() to skip the check for vCPUs that
> > > > > are not initialized yet.
> > > > >
> > > > > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
> > > > > Signed-off-by: Reiji Watanabe <reijiw at google.com>
> > > > > ---
> > > > > arch/arm64/kvm/reset.c | 11 +++++++++++
> > > > > 1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > > > > index 426bd7fbc3fd..ef78bbc7566a 100644
> > > > > --- a/arch/arm64/kvm/reset.c
> > > > > +++ b/arch/arm64/kvm/reset.c
> > > > > @@ -180,8 +180,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > > > > if (kvm_has_mte(vcpu->kvm) && is32bit)
> > > > > return false;
> > > > >
> > > > > + /*
> > > > > + * Make sure vcpu->arch.target setting is visible from others so
> > > > > + * that the width consistency checking between two vCPUs is done
> > > > > + * by at least one of them at KVM_ARM_VCPU_INIT.
> > > > > + */
> > > > > + smp_mb();
> > > >
> > > > From ARM DDI 0487G.a, page B2-146 ("Data Memory Barrier (DMB)"):
> > > >
> > > > "The DMB instruction is a memory barrier instruction that ensures the relative
> > > > order of memory accesses before the barrier with memory accesses after the
> > > > barrier."
> > > >
> > > > I'm going to assume from the comment that you are referring to completion of
> > > > memory accesses ("Make sure [..] is visible from others"). Please correct me if
> > > > I am wrong. In this case, DMB ensures ordering of memory accesses with regards
> > > > to writes and reads, not *completion*. Have a look at
> > > > tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus for
> > > > the classic message passing example as an example of memory ordering.
> > > > Message passing and other patterns are also explained in ARM DDI 0487G.a, page
> > > > K11-8363.
> > > >
> > > > I'm not saying that your approach is incorrect, but the commit message should
> > > > explain what memory accesses are being ordered relative to each other and why.
> > >
> > > Thank you so much for the review.
> > > What I meant with the comment was:
> > > ---
> > > DMB is used to make sure that writing @vcpu->arch.target, which is done
> > > by kvm_vcpu_set_target() before getting here, is visible to other PEs
> > > before the following kvm_for_each_vcpu iteration reads the other vCPUs'
> > > target field.
> > > ---
> > > Did the comment become more clear ?? (Or do I use DMB incorrectly ?)
> > >
> > > > > +
> > > > > /* Check that the vcpus are either all 32bit or all 64bit */
> > > > > kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > > > > + /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > > > > + if (tmp->arch.target == -1)
> > > > > + continue;
> > >
> > > I just noticed DMB(ishld) is needed here to assure ordering between
> > > reading tmp->arch.target and reading vcpu->arch.features for this fix.
> > > Similarly, kvm_vcpu_set_target() needs DMB(ishst) to assure ordering
> > > between writing vcpu->arch.features and writing vcpu->arch.target...
> > > I am going to fix them in the v2 series.
> >
> > Yes, you'd need at least this, and preferably in their smp_rmb/wmb
> > variants.
> >
> > However, this looks like a pretty fragile construct, as there are
> > multiple paths where we can change target (including some error
> > paths from the run loop).
> >
> > I'd rather all changes to target and the feature bits happen under the
> > kvm->lock, and take that lock when checking for consistency in
> > vcpu_allowed_register_width(), as this isn't a fast path. I wrote the
> > following, which is obviously incomplete and as usual untested.
>
> I think this is the better approach, because we also want to make sure that
> a PE observes changes to target and features as soon as they have been
> made, to avoid situations where one PE sets the target and the 32bit
> feature, and another PE reads the old values and skips the check, in which
> case memory ordering is not enough.
Thank you for the comments and the suggestion.
I will look into fixing this based on the suggestion.
Thanks,
Reiji
>
> Thanks,
> Alex
>
> >
> > Thanks,
> >
> > M.
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..42f2ab80646c 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1061,7 +1061,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> > static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> > const struct kvm_vcpu_init *init)
> > {
> > - unsigned int i, ret;
> > + unsigned int i;
> > + int ret = 0;
> > u32 phys_target = kvm_target_cpu();
> >
> > if (init->target != phys_target)
> > @@ -1074,32 +1075,46 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> > if (vcpu->arch.target != -1 && vcpu->arch.target != init->target)
> > return -EINVAL;
> >
> > + /* Hazard against a concurent check of the target in kvm_reset_vcpu() */
> > + mutex_lock(&vcpu->kvm->lock);
> > +
> > /* -ENOENT for unknown features, -EINVAL for invalid combinations. */
> > for (i = 0; i < sizeof(init->features) * 8; i++) {
> > bool set = (init->features[i / 32] & (1 << (i % 32)));
> >
> > - if (set && i >= KVM_VCPU_MAX_FEATURES)
> > - return -ENOENT;
> > + if (set && i >= KVM_VCPU_MAX_FEATURES) {
> > + ret = -ENOENT;
> > + break;
> > + }
> >
> > /*
> > * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
> > * use the same feature set.
> > */
> > if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES &&
> > - test_bit(i, vcpu->arch.features) != set)
> > - return -EINVAL;
> > + test_bit(i, vcpu->arch.features) != set) {
> > + ret = -EINVAL;
> > + break;
> > + }
> >
> > if (set)
> > set_bit(i, vcpu->arch.features);
> > }
> >
> > - vcpu->arch.target = phys_target;
> > + if (!ret)
> > + vcpu->arch.target = phys_target;
> > +
> > + mutex_unlock(&vcpu->kvm->lock);
> > + if (ret)
> > + return ret;
> >
> > /* Now we know what it is, we can reset it. */
> > ret = kvm_reset_vcpu(vcpu);
> > if (ret) {
> > + mutex_lock(&vcpu->kvm->lock);
> > vcpu->arch.target = -1;
> > bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> > + mutex_unlock(&vcpu->kvm->lock);
> > }
> >
> > return ret;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index ef78bbc7566a..fae88a703140 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -180,13 +180,6 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > if (kvm_has_mte(vcpu->kvm) && is32bit)
> > return false;
> >
> > - /*
> > - * Make sure vcpu->arch.target setting is visible from others so
> > - * that the width consistency checking between two vCPUs is done
> > - * by at least one of them at KVM_ARM_VCPU_INIT.
> > - */
> > - smp_mb();
> > -
> > /* Check that the vcpus are either all 32bit or all 64bit */
> > kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > @@ -222,14 +215,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> > {
> > struct vcpu_reset_state reset_state;
> > - int ret;
> > + int ret = -EINVAL;
> > bool loaded;
> > u32 pstate;
> >
> > mutex_lock(&vcpu->kvm->lock);
> > - reset_state = vcpu->arch.reset_state;
> > - WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > + if (vcpu_allowed_register_width(vcpu)) {
> > + reset_state = vcpu->arch.reset_state;
> > + WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > + ret = 0;
> > + }
> > mutex_unlock(&vcpu->kvm->lock);
> > + if (ret)
> > + goto out;
> >
> > /* Reset PMU outside of the non-preemptible section */
> > kvm_pmu_vcpu_reset(vcpu);
> > @@ -257,11 +255,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> > }
> > }
> >
> > - if (!vcpu_allowed_register_width(vcpu)) {
> > - ret = -EINVAL;
> > - goto out;
> > - }
> > -
> > switch (vcpu->arch.target) {
> > default:
> > if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> >
> > --
> > Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list