[PATCH v6 19/39] KVM: arm64: gic-v5: Implement PPI interrupt injection

Tue Mar 17 09:31:23 PDT 2026

On Tue, 17 Mar 2026 11:44:55 +0000,
Sascha Bischoff <Sascha.Bischoff at arm.com> wrote:
> 
> This change introduces interrupt injection for PPIs for GICv5-based
> guests.
> 
> The lifecycle of PPIs is largely managed by the hardware for a GICv5
> system. The hypervisor injects pending state into the guest by using
> the ICH_PPI_PENDRx_EL2 registers. These are used by the hardware to
> pick a Highest Priority Pending Interrupt (HPPI) for the guest based
> on the enable state of each individual interrupt. The enable state and
> priority for each interrupt are provided by the guest itself (through
> writes to the PPI registers).
> 
> When Direct Virtual Interrupt (DVI) is set for a particular PPI, the
> hypervisor is even able to skip the injection of the pending state
> altogether - it all happens in hardware.
> 
> The result of the above is that no AP lists are required for GICv5,
> unlike for older GICs. Instead, for PPIs the ICH_PPI_* registers
> fulfil the same purpose for all 128 PPIs. Hence, as long as the
> ICH_PPI_* registers are populated prior to guest entry, and merged
> back into the KVM shadow state on exit, the PPI state is preserved,
> and interrupts can be injected.
> 
> When injecting the state of a PPI the state is merged into the
> PPI-specific vgic_irq structure. The PPIs are made pending via the
> ICH_PPI_PENDRx_EL2 registers, the value of which is generated from the
> vgic_irq structures for each PPI exposed on guest entry. The
> queue_irq_unlock() irq_op is required to kick the vCPU to ensure that
> it seems the new state. The result is that no AP lists are used for
> private interrupts on GICv5.
> 
> Prior to entering the guest, vgic_v5_flush_ppi_state() is called from
> kvm_vgic_flush_hwstate(). This generates the pending state to inject
> into the guest, and snapshots it (twice - an entry and an exit copy)
> in order to track any changes. These changes can come from a guest
> consuming an interrupt or from a guest making an Edge-triggered
> interrupt pending.
> 
> When returning from running a guest, the guest's PPI state is merged
> back into KVM's vgic_irq state in vgic_v5_merge_ppi_state() from
> kvm_vgic_sync_hwstate(). The Enable and Active state is synced back for
> all PPIs, and the pending state is synced back for Edge PPIs (Level is
> driven directly by the devices generating said levels). The incoming
> pending state from the guest is merged with KVM's shadow state to
> avoid losing any incoming interrupts.
> 
> Signed-off-by: Sascha Bischoff <sascha.bischoff at arm.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron at huawei.com>
> ---
>  arch/arm64/kvm/vgic/vgic-v5.c | 143 ++++++++++++++++++++++++++++++++++
>  arch/arm64/kvm/vgic/vgic.c    |  41 ++++++++--
>  arch/arm64/kvm/vgic/vgic.h    |  25 +++---
>  3 files changed, 194 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-v5.c b/arch/arm64/kvm/vgic/vgic-v5.c
> index 07f416fbc4bc8..e080fce61dc35 100644
> --- a/arch/arm64/kvm/vgic/vgic-v5.c
> +++ b/arch/arm64/kvm/vgic/vgic-v5.c
> @@ -122,6 +122,149 @@ int vgic_v5_finalize_ppi_state(struct kvm *kvm)
>  	return 0;
>  }
>  
> +/*
> + * For GICv5, the PPIs are mostly directly managed by the hardware. We (the
> + * hypervisor) handle the pending, active, enable state save/restore, but don't
> + * need the PPIs to be queued on a per-VCPU AP list. Therefore, sanity check the
> + * state, unlock, and return.
> + */
> +static bool vgic_v5_ppi_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
> +					 unsigned long flags)
> +	__releases(&irq->irq_lock)
> +{
> +	struct kvm_vcpu *vcpu;
> +
> +	lockdep_assert_held(&irq->irq_lock);
> +
> +	if (WARN_ON_ONCE(!__irq_is_ppi(KVM_DEV_TYPE_ARM_VGIC_V5, irq->intid)))
> +		goto out_unlock_fail;
> +
> +	vcpu = irq->target_vcpu;
> +	if (WARN_ON_ONCE(!vcpu))
> +		goto out_unlock_fail;
> +
> +	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
> +
> +	/* Directly kick the target VCPU to make sure it sees the IRQ */
> +	kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> +	kvm_vcpu_kick(vcpu);
> +
> +	return true;
> +
> +out_unlock_fail:
> +	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
> +
> +	return false;
> +}
> +
> +static struct irq_ops vgic_v5_ppi_irq_ops = {
> +	.queue_irq_unlock = vgic_v5_ppi_queue_irq_unlock,
> +};
> +
> +void vgic_v5_set_ppi_ops(struct vgic_irq *irq)
> +{
> +	if (WARN_ON(!irq))
> +		return;
> +
> +	guard(raw_spinlock_irqsave)(&irq->irq_lock);
> +
> +	if (!WARN_ON(irq->ops))
> +		irq->ops = &vgic_v5_ppi_irq_ops;
> +}

Why isn't this a call to kvm_vgic_set_irq_ops()? It feels very odd to
have two ways to do the same thing.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.