[PATCH v2 07/16] KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2

Wed Apr 1 09:24:55 PDT 2026

On Wed, 2026-04-01 at 11:36 +0100, Marc Zyngier wrote:
> While it is perfectly correct to leave the pending state of a level
> interrupt as is when queuing it (it is, after all, only driven by
> the line), edge pending state must be transferred, as nothing will
> lower it.
> 
> Reviewed-by: Sascha Bischoff <sascha.bischoff at arm.com>
> Fixes: 4d591252bacb2 ("KVM: arm64: gic-v5: Implement PPI interrupt injection")
> Link: https://sashiko.dev/#/patchset/20260319154937.3619520-1-sascha.bischoff%40arm.com
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> ---
>  arch/arm64/kvm/vgic/vgic-v5.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-v5.c b/arch/arm64/kvm/vgic/vgic-v5.c
> index 119d7d01d0e77..422741c86c6a8 100644
> --- a/arch/arm64/kvm/vgic/vgic-v5.c
> +++ b/arch/arm64/kvm/vgic/vgic-v5.c
> @@ -445,8 +445,11 @@ void vgic_v5_flush_ppi_state(struct kvm_vcpu *vcpu)
>  
>  		irq = vgic_get_vcpu_irq(vcpu, intid);
>  
> -		scoped_guard(raw_spinlock_irqsave, &irq->irq_lock)
> +		scoped_guard(raw_spinlock_irqsave, &irq->irq_lock) {
>  			__assign_bit(i, pendr, irq_is_pending(irq));
> +			if (irq->config == VGIC_CONFIG_EDGE)
> +				irq->pending_latch = false;
> +		}
>  
>  		vgic_put_irq(vcpu->kvm, irq);
>  	}

With this change we end up losing edges (so we actually have the
opposite problem!). I missed this in the previous iteration and have
only just uncovered it during testing. It was hidden because we clear
TWI when only a single task is running, so the GICv5 PPI selftest
didn't pick it up until I stressed the test system more heavily.

In vgic_v5_fold_ppi_state() we detect changes in the active or pending
state on guest exit. If there is no change, i.e., the guest hasn't
consumed the edge yet, we don't sync it back to the corresponding
struct vgic_irq. As the flush has already cleared pending_latch by
then, the edge is lost forever.
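
As a rough illustration of that failure mode, here is a toy user-space
model of the flush/fold round trip. It is only a sketch under stated
assumptions, not the kernel code: struct toy_irq, struct toy_cpu,
toy_flush() and toy_fold_on_change() are hypothetical stand-ins, a
single uint64_t models the pending register, and the "guest" never
consumes the interrupt between flush and fold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the edge-relevant part of struct vgic_irq. */
struct toy_irq {
	bool edge;          /* true: edge-triggered, false: level */
	bool pending_latch; /* software view of the pending edge */
};

/* Hypothetical per-vCPU state; one word models the pending register. */
struct toy_cpu {
	struct toy_irq irq[64];
	uint64_t pendr;       /* what the "hardware" holds */
	uint64_t pendr_entry; /* snapshot taken at guest entry */
};

/* Flush: copy the latch to the register, then clear it for edges. */
static void toy_flush(struct toy_cpu *cpu, unsigned int intid)
{
	assert(intid < 64);
	struct toy_irq *irq = &cpu->irq[intid];
	uint64_t bit = UINT64_C(1) << intid;

	if (irq->pending_latch)
		cpu->pendr |= bit;
	else
		cpu->pendr &= ~bit;
	if (irq->edge)
		irq->pending_latch = false;
	cpu->pendr_entry = cpu->pendr;
}

/* Fold with change detection: write back only bits that changed. */
static void toy_fold_on_change(struct toy_cpu *cpu)
{
	uint64_t changed = cpu->pendr ^ cpu->pendr_entry;

	for (unsigned int i = 0; i < 64; i++) {
		if (changed & (UINT64_C(1) << i))
			cpu->irq[i].pending_latch = (cpu->pendr >> i) & 1;
	}
}

/* True if a pending edge survives a round trip with no guest progress. */
static bool toy_edge_survives(unsigned int intid)
{
	struct toy_cpu cpu = { 0 };

	assert(intid < 64);
	cpu.irq[intid].edge = true;
	cpu.irq[intid].pending_latch = true;
	toy_flush(&cpu, intid);
	/* Guest exits without taking the interrupt: pendr is unchanged. */
	toy_fold_on_change(&cpu);
	/* The next flush would start from this software state. */
	return cpu.irq[intid].pending_latch;
}
```

In this model toy_edge_survives() returns false for any INTID: the
latch was cleared on flush and nothing changed in the register, so the
fold never writes the still-pending bit back.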

The detection of changes on fold was introduced in order to reduce the
overhead back when we were syncing the state for all 128 potential
PPIs. Now that we've reduced the set to the 64 architected PPIs, and
actually only use a subset of those (SW_PPI, PMUIRQ, timers - up to 4
in total currently), I'm not sure that still makes sense.

I think it instead makes sense to iterate over the mask of PPIs exposed
to the guest and sync only those back to KVM's vgic_irq state. This
means that we first of all avoid losing any edges, and secondly are
able to drop the extra pending state tracking.
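
A minimal sketch of that idea, again as a toy user-space model rather
than the actual patch: struct toy_state, toy_fold_exposed() and the
exposed_mask parameter are hypothetical names, and a single uint64_t
stands in for the pending register read back on guest exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software-side pending state, one flag per PPI. */
struct toy_state {
	bool pending_latch[64];
};

/*
 * Sync every PPI in exposed_mask back from the register value, whether
 * or not it changed while the guest ran. Bits outside the mask are
 * left untouched.
 */
static void toy_fold_exposed(struct toy_state *st, uint64_t pendr,
			     uint64_t exposed_mask)
{
	for (unsigned int i = 0; i < 64; i++) {
		if (exposed_mask & (UINT64_C(1) << i))
			st->pending_latch[i] = (pendr >> i) & 1;
	}
}

/* True if a flushed edge survives when the guest consumes nothing. */
static bool toy_edge_survives_masked(unsigned int intid)
{
	assert(intid < 64);
	struct toy_state st = { 0 };
	uint64_t bit = UINT64_C(1) << intid;
	uint64_t pendr = bit; /* the flush moved the edge here */

	st.pending_latch[intid] = false; /* and cleared the latch */
	toy_fold_exposed(&st, pendr, bit);
	return st.pending_latch[intid];
}
```

Because the write-back no longer depends on a before/after comparison,
the model needs no entry snapshot of the pending register at all.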

I've just posted a patch based on this series which addresses the issue
here:

https://lore.kernel.org/linux-arm-kernel/20260401162152.932243-1-sascha.bischoff@arm.com

Thanks,
Sascha