[bug report] GICv4.1: vSGI remains pending across the guest reset

Sun Dec 17 03:26:15 PST 2023

On Thu, 14 Dec 2023 12:13:57 +0000,
Kunkun Jiang <jiangkunkun at huawei.com> wrote:
> 
> Hi list,
> 
> We have observed on GICv4.1 systems that, after a guest reset, the
> secondary VCPU would receive an IPI_CPU_STOP accidently and failed to
> come online eventually.
> 
> | Guest User-space

Guest userspace????

> |
> | reset (with a vSGI pending in the
> | hardware) [0]
> |
> | disable the distributor (write 0
> | into GICD_CTLR) [1]
> |
> | clear pending status (write 0 into
> | GICR_ISPENDR0 for each RD) [2]

This cannot clear the pending bits. Writing 0 to any of the
ISPEND/ICPEND registers is effectively a NOP.

If you want to remove the pending SGIs, you need to write a bunch of
1s to ICPENDR0.

> |
> | disable the distributor (write 0
> | into GICD_CTLR) [3]
> |
> | enable the distributor with ARE,
> | Group1 and nASSGIreq [4]
> 
> The problem is that even if user-space tries to disable the distributor

I don't understand what userspace does here. Why is it significant
that it is an EL0 access? I don't know of any SW doing that, but even
if it existed, this should make no difference.

> and clear pending bits for all SGIs, we don't actually propogate it into
> HW (we only record it via vgic_dist->{enabled, nassgireq} and
> vgic_irq->pending_latch) and the vSGI remains pending across the guest
> reset.
>
> And when we're at [4], all vSGI's vgic_irq->hw are *true* and
> vgic_v4_enable_vsgis() becomes a nop.. That's not good.
> 
> The following solution can solve the problem. Not sure if this is a good
> solution.Sent out early for suggestions or solutions.
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 89117ba2528a..3678ab33f5b9 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -374,6 +374,10 @@ static int vgic_v3_uaccess_write_pending(struct
> kvm_vcpu *vcpu,
>              irq->pending_latch = true;
>              vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
>          } else {
> +             if (irq->hw && vgic_irq_is_sgi(irq->intid))
> +                 irq_set_irqchip_state(irq->host_irq,
> +                              IRQCHIP_STATE_PENDING,
> +                              false);
>              irq->pending_latch = false;
>              raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
>          }
> 

But this has *nothing* to do with the guest. This is the *host*
userspace performing a write to the redistributor view, which has
different semantics. Which is why your earlier description made no
sense to me.

I think the problem is slightly larger than what you describe. A write
to ISPENDR0 should be propagated to the ITS for any values of the
latch, just like this happens on enabling HW-backed SGIs.

Can you please give this a go?

Thanks,

	M.

From 9932d74392d969057e84bc3c18bd15f1769b6211 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz at kernel.org>
Date: Sun, 17 Dec 2023 11:15:09 +0000
Subject: [PATCH] KVM: arm64: vgic-v4: Restore pending state on host userspace
 write

When the VMM writes to ISPENDR0 to set the state pending state of
an SGI, we fail to convey this to the HW if this SGI is already
backed by a GICv4.1 vSGI.

This is a bit of a corner case, as this would only occur if the
vgic state is changed on an already running VM, but this can
apparently happen across a guest reset driven by the VMM.

Fix this by always writing out the pending_latch value to the
HW, and reseting it to false.

Reported-by: Kunkun Jiang <jiangkunkun at huawei.com>
Signed-off-by: Marc Zyngier <maz at kernel.org>
Link: https://lore.kernel.org/r/7e7f2c0c-448b-10a9-8929-4b8f4f6e2a32@huawei.com
---
 arch/arm64/kvm/vgic/vgic-mmio-v3.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index a764b0ab8bf9..2533f264b954 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -365,19 +365,26 @@ static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
 		raw_spin_lock_irqsave(&irq->irq_lock, flags);
-		if (test_bit(i, &val)) {
-			/*
-			 * pending_latch is set irrespective of irq type
-			 * (level or edge) to avoid dependency that VM should
-			 * restore irq config before pending info.
-			 */
-			irq->pending_latch = true;
-			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
-		} else {
+
+		/*
+		 * pending_latch is set irrespective of irq type
+		 * (level or edge) to avoid dependency that VM should
+		 * restore irq config before pending info.
+		 */
+		irq->pending_latch = test_bit(i, &val);
+
+		if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
+			irq_set_irqchip_state(irq->host_irq,
+					      IRQCHIP_STATE_PENDING,
+					      irq->pending_latch);
 			irq->pending_latch = false;
-			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 		}
 
+		if (irq->pending_latch)
+			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+		else
+			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 
-- 
2.39.2


-- 
Without deviation from the norm, progress is not possible.