[PATCH v2 16/39] KVM: arm64: gic-v5: Request doorbells when VPEs enter WFI
Sascha Bischoff
Sascha.Bischoff at arm.com
Thu May 21 07:54:37 PDT 2026
When a GICv5 VPE is made non-resident as part of the vcpu entering
WFI, request a VPE doorbell so that KVM can be notified when a
suitable SPI or LPI becomes pending for that VPE.
Program the doorbell priority mask, DBPM, from the effective virtual
priority mask before making the VPE non-resident. DBPM is the priority
threshold used by the GICv5 hardware to decide whether a pending SPI
or LPI is allowed to signal the VPE doorbell. This allows hardware to
signal the doorbell only for interrupts that the vcpu can actually
take, and avoids waking it for interrupts masked by the guest priority
state. If the effective priority mask is 0, no interrupt can be
signalled to the vcpu, so leave the doorbell request clear.
Make the doorbell interrupt affine to the current CPU before
requesting it. This nudges the wakeup back towards the CPU that last
ran the vcpu, where the relevant state is more likely to be cache-hot,
while also spreading doorbell interrupts across host PEs as different
vcpus enter WFI on different CPUs.
Clear stale db_fired state before making the VPE non-resident. Any
previous doorbell notification has already been consumed by this
point, and clearing it before the non-resident transition ensures that
a newly fired doorbell is observed.
Finally, teach kvm_vgic_vcpu_pending_irq() to report pending work for
a GICv5 vcpu when its VPE doorbell has fired, in addition to the
existing pending-PPI check.
Signed-off-by: Sascha Bischoff <sascha.bischoff at arm.com>
---
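For reviewers, the DBPM derivation described above can be sketched in
isolation. This is only an illustrative, standalone model of the logic in
vgic_v5_put(): struct db_request and db_request_from_mask() are
hypothetical names that do not exist in the patch, and the packing into
ICH_CONTEXTR_EL2 via FIELD_PREP() is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only (not part of the patch). */
struct db_request {
	bool request;	/* whether a doorbell should be requested */
	uint32_t dbpm;	/* doorbell priority mask; only valid if request is set */
};

/*
 * Model of the decision made when a vcpu enters WFI: an effective
 * priority mask of 0 means no interrupt can be signalled, so no doorbell
 * is requested. Otherwise an interrupt must have a higher priority
 * (numerically lower value) than the mask, hence DBPM = mask - 1.
 */
static struct db_request db_request_from_mask(uint32_t priority_mask)
{
	struct db_request req = { .request = false, .dbpm = 0 };

	if (priority_mask) {
		req.request = true;
		req.dbpm = priority_mask - 1;
	}

	return req;
}
```

In the patch itself the equivalent of req.request and req.dbpm end up in
the DB and DBPM fields of cpu_if->vgic_contextr.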
arch/arm64/kvm/hyp/vgic-v5-sr.c | 9 ++++++++
arch/arm64/kvm/vgic/vgic-v5.c | 40 +++++++++++++++++++++++++++++++++
arch/arm64/kvm/vgic/vgic.c | 6 ++++-
3 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/vgic-v5-sr.c b/arch/arm64/kvm/hyp/vgic-v5-sr.c
index f064045a31aee..46992a6c2cacb 100644
--- a/arch/arm64/kvm/hyp/vgic-v5-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v5-sr.c
@@ -22,6 +22,15 @@ void __vgic_v5_make_resident(struct vgic_v5_cpu_if *cpu_if)
void __vgic_v5_make_non_resident(struct vgic_v5_cpu_if *cpu_if)
{
+ /*
+ * Clear the db_fired state to ensure that we're ready for the next
+ * doorbell when it is requested. If a doorbell firing caused us to
+ * enter the guest, then we've already consumed that state at this
+ * point, so this is safe to clear. Use WRITE_ONCE() so that the store is
+ * not torn, as the doorbell handler may concurrently set the state again.
+ */
+ WRITE_ONCE(cpu_if->gicv5_vpe.db_fired, false);
+
/*
* Make as non-resident before actually making non-resident. Avoids race
* with doorbell arriving.
diff --git a/arch/arm64/kvm/vgic/vgic-v5.c b/arch/arm64/kvm/vgic/vgic-v5.c
index 25590cf5ebee1..b966495901cc4 100644
--- a/arch/arm64/kvm/vgic/vgic-v5.c
+++ b/arch/arm64/kvm/vgic/vgic-v5.c
@@ -1079,6 +1079,46 @@ void vgic_v5_put(struct kvm_vcpu *vcpu)
kvm_call_hyp(__vgic_v5_save_apr, cpu_if);
cpu_if->vgic_contextr = 0;
+ if (vcpu_get_flag(vcpu, IN_WFI)) {
+ u32 priority_mask;
+ int dbpm;
+
+ /*
+ * Find the virtual running priority and use this to calculate
+ * the doorbell priority mask. We combine the highest active
+ * priority and the CPU's priority mask. The guest can't handle
+ * interrupts with priorities less than or equal to the virtual
+ * running priority, so there's literally no point in waking the
+ * guest for these.
+ *
+ * The priority needs to be higher than the mask to signal, so
+ * pick the next higher priority (subtract 1).
+ */
+ priority_mask = vgic_v5_get_effective_priority_mask(vcpu);
+
+ /*
+ * Request a doorbell *unless* the priority is 0, indicating
+ * that no interrupt can wake the CPU up.
+ */
+ if (priority_mask) {
+ int db_irq = vgic_v5_vpe_db(vcpu);
+ struct irq_data *d = irq_get_irq_data(db_irq);
+ const struct cpumask *aff = irq_data_get_effective_affinity_mask(d);
+ int cpu = smp_processor_id();
+
+ dbpm = priority_mask - 1;
+ cpu_if->vgic_contextr = FIELD_PREP(ICH_CONTEXTR_EL2_DB, 1) |
+ FIELD_PREP(ICH_CONTEXTR_EL2_DBPM, dbpm);
+
+ /*
+ * Make the doorbell affine to this CPU, if it isn't
+ * already. Actively check the cpumask first as it is
+ * cheaper than changing the affinity every time.
+ */
+ if (!cpumask_test_cpu(cpu, aff))
+ WARN_ON(irq_set_affinity(db_irq, cpumask_of(cpu)));
+ }
+ }
kvm_call_hyp(__vgic_v5_make_non_resident, cpu_if);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index b697678d68b01..d56e87a0d2acc 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -1229,8 +1229,12 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
unsigned long flags;
struct vgic_vmcr vmcr;
- if (vgic_is_v5(vcpu->kvm))
+ if (vgic_is_v5(vcpu->kvm)) {
+ if (READ_ONCE(vcpu->arch.vgic_cpu.vgic_v5.gicv5_vpe.db_fired))
+ return true;
+
return vgic_v5_has_pending_ppi(vcpu);
+ }
if (!vcpu->kvm->arch.vgic.enabled)
return false;
--
2.34.1