[bug report] KVM: arm64: vgic-v4: Occasionally issue VMOVP to an unmapped VPE on GICv4.1

Marc Zyngier maz at kernel.org
Sun Sep 29 03:07:38 PDT 2024


On Sun, 29 Sep 2024 08:18:41 +0100,
Kunkun Jiang <jiangkunkun at huawei.com> wrote:
> 
> Hi all,
> 
> I found a problem with occasionally issuing VMOVP to an unmapped VPE
> on GICv4.1. In my test environment, operating an unmapped VPE will
> generate RAS, so I found this problem. The detailed analysis is as
> follows.
> 
> The vgic_v4_teardown() will be executed when VM is destroyed to free
> the GICv4 data structures. The code is as follows:
> > /**
> >  * vgic_v4_teardown - Free the GICv4 data structures
> >  * @kvm:        Pointer to the VM being destroyed
> >  */
> > void vgic_v4_teardown(struct kvm *kvm)
> > {
> >         struct its_vm *its_vm = &kvm->arch.vgic.its_vm;
> >         int i;
> > 
> >         lockdep_assert_held(&kvm->arch.config_lock);
> > 
> >         if (!its_vm->vpes)
> >                 return;
> > 
> >         for (i = 0; i < its_vm->nr_vpes; i++) {
> >                 struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, i);
> >                 int irq = its_vm->vpes[i]->irq;
> > 
> >                 irq_clear_status_flags(irq, DB_IRQ_FLAGS);
> >                 free_irq(irq, vcpu);
> >         }
> > 
> >         its_free_vcpu_irqs(its_vm);
> >         kfree(its_vm->vpes);
> >         its_vm->nr_vpes = 0;
> >         its_vm->vpes = NULL;
> > }
> 
> [1] In irq_clear_status_flags(irq, DB_IRQ_FLAGS), the status flags of
> a doorbell are cleared. DB_IRQ_FLAGS contains IRQ_NO_BALANCING. So
> after this,the irqbalance.service can schedule the doorbell.
> [2] In free_irq(), the VPE is unmaped.
> [3] In its_free_vcpu_irqs(its_vm), unregister_irq_proc() is called to
> delete the contents in /proc/irq/xx/ of the doorbell.
> 
> For VPEs in large-scale VM, there is a centain time window between [2]
> and [3]. The irqbalance.service got a chance to schedule the
> doorbell. Therefore, the VMOVP is issued to an unmapped VPE.
> 
> I tried not clearing IRQ_NO_BALANCING and the problem was solved. But
> it's not clear if there's any other problem with doing so.

I don't think that's a good idea, because whoever request the same
interrupt number again for a different purpose will have the flag set
and will experience odd behaviours.

I'd rather fix it for good, given that we have all the necessary
tracking in place already. Something like the patch below, as usual
untested.

Thanks,

	M.

From c0fe216c50651458fdf1ebb6b3a1e4ffe2bcc6c2 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz at kernel.org>
Date: Sun, 29 Sep 2024 10:58:19 +0100
Subject: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE

Kunkun Jiang reports that there is a small window of opportunity for
userspace to force a change of affinity for a VPE while the VPE has
already been unmapped, but the corresponding doorbell interrupt still
visible in /proc/irq/.

Plug the race by checking the value of vmapp_count, which tracks whether
the VPE is mapped ot not, and returning an error in this case.

Reported-by: Kunkun Jiang <jiangkunkun at huawei.com>
Signed-off-by: Marc Zyngier <maz at kernel.org>
Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fdec478ba5e7..ba9734d1a41f 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	struct cpumask *table_mask;
 	unsigned long flags;
 
+	/*
+	 * Check if we're racing against a VPE being destroyed, for
+	 * which we don't want to allow a VMOVP.
+	 */
+	if (!atomic_read(&vpe->vmapp_count))
+		return -EINVAL;
+
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
 	 * we can and only do it if we really have to. Also, if mapped
-- 
2.43.0


-- 
Without deviation from the norm, progress is not possible.



More information about the linux-arm-kernel mailing list