[PATCH v2 19/36] KVM: arm64: gic-v5: Check for pending PPIs

Sascha Bischoff Sascha.Bischoff at arm.com
Thu Jan 8 08:23:48 PST 2026


On Wed, 2026-01-07 at 15:00 +0000, Jonathan Cameron wrote:
> On Fri, 19 Dec 2025 15:52:42 +0000
> Sascha Bischoff <Sascha.Bischoff at arm.com> wrote:
> 
> > This change allows KVM to check for pending PPI interrupts. This has
> > two main components:
> > 
> > First of all, the effective priority mask is calculated.  This is a
> > combination of the priority mask in the VPEs ICC_PCR_EL1.PRIORITY and
> > the currently running priority as determined from the VPE's
> > ICH_APR_EL1. If an interrupt's prioirity is greater than or equal to
> 
> priority
> 
> > the effective priority mask, it can be signalled. Otherwise, it cannot.
> > 
> > Secondly, any Enabled and Pending PPIs must be checked against this
> > compound priority mask. This requires the PPI priorities to be synced
> > back to the KVM shadow state - this is skipped in general operation as
> > it isn't required and is rather expensive. If any Enabled and Pending
> > PPIs are of sufficient priority to be signalled, then there are
> > pending PPIs. Else, there are not.  This ensures that a VPE is not
> > woken when it cannot actually process the pending interrupts.
> > 
> > Signed-off-by: Sascha Bischoff <sascha.bischoff at arm.com>
> Hi Sascha,
> 
> One thing I notice in here is the use of unsigned long vs u64 is a bit
> inconsistent.  When it's a register or something we just read from a
> register I'd always use u64.

Yeah, I'd like to do the same. The issue is that the for_each_set_bit()
loop construct only works with unsigned long, and not u64. I'll rework
the code to use u64 wherever possible.
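
For the bit walks, the plan is to keep the register copies as u64 and
only drop to unsigned long where for_each_set_bit() needs it. Untested
sketch of what I have in mind:

	/* arm64: unsigned long is 64 bits, so this copy is lossless */
	const u64 enabler = cpu_if->vgic_ich_ppi_enabler_exit[reg];
	const u64 pendr = cpu_if->vgic_ppi_pendr_exit[reg];
	unsigned long possible_bits = enabler & pendr;
	int i;

	/* for_each_set_bit() wants an unsigned long *, hence the local copy */
	for_each_set_bit(i, &possible_bits, 64) {
		/* per-interrupt handling as before */
	}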

> 
> A few other things inline.
> > ---
> >  arch/arm64/kvm/vgic/vgic-v5.c | 121 ++++++++++++++++++++++++++++++++++
> >  arch/arm64/kvm/vgic/vgic.c    |   5 +-
> >  arch/arm64/kvm/vgic/vgic.h    |   1 +
> >  3 files changed, 126 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/vgic/vgic-v5.c b/arch/arm64/kvm/vgic/vgic-v5.c
> > index cb3dd872d16a6..c7ecc4f40b1e5 100644
> > --- a/arch/arm64/kvm/vgic/vgic-v5.c
> > +++ b/arch/arm64/kvm/vgic/vgic-v5.c
> > @@ -56,6 +56,31 @@ int vgic_v5_probe(const struct gic_kvm_info *info)
> >  	return 0;
> >  }
> >  
> > +static u32 vgic_v5_get_effective_priority_mask(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vgic_v5_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v5;
> > +	u32 highest_ap, priority_mask;
> > +
> > +	/*
> > +	 * Counting the number of trailing zeros gives the current
> > +	 * active priority. Explicitly use the 32-bit version here as
> 
> Short wrap.  I'll stop commenting on these and assume you'll check
> throughout (or ignore throughout if you disagree ;) Everyone should
> use an email client with rulers!

Yeah, I'll address these wherever I spot them.

> 
> > +	 * we have 32 priorities. 0x20 then means that there are no
> > +	 * active priorities.
> > +	 */
> > +	highest_ap = cpu_if->vgic_apr ? __builtin_ctz(cpu_if->vgic_apr) : 32;
> 
> If the comment is going to say 0x20 means no active, then use hex in
> the code as well. Or just use 32 in the comment.

Done.

> 
> > +
> > +	/*
> > +	 * An interrupt is of sufficient priority if it is equal to or
> > +	 * greater than the priority mask. Add 1 to the priority mask
> > +	 * (i.e., lower priority) to match the APR logic before taking
> > +	 * the min. This gives us the lowest priority that is masked.
> > +	 */
> > +	priority_mask = FIELD_GET(FEAT_GCIE_ICH_VMCR_EL2_VPMR, cpu_if->vgic_vmcr);
> > +	priority_mask = min(highest_ap, priority_mask + 1);
> > +
> > +	return priority_mask;
> 
> Unless you are going to do more with that in later patches
> 	return min(highest_ap, priority_mask + 1);
> Doesn't lose any significant readability to my eyes.

Agreed, done.
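
With both of those folded in, the helper ends up roughly as below
(untested, and still subject to the u64 rework above):

static u32 vgic_v5_get_effective_priority_mask(struct kvm_vcpu *vcpu)
{
	struct vgic_v5_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v5;
	u32 highest_ap, priority_mask;

	/*
	 * Counting the number of trailing zeros gives the current
	 * active priority. Explicitly use the 32-bit version here as
	 * we have 32 priorities. 32 then means that there are no
	 * active priorities.
	 */
	highest_ap = cpu_if->vgic_apr ? __builtin_ctz(cpu_if->vgic_apr) : 32;

	/*
	 * An interrupt is of sufficient priority if it is equal to or
	 * greater than the priority mask. Add 1 to the priority mask
	 * (i.e., lower priority) to match the APR logic before taking
	 * the min. This gives us the lowest priority that is masked.
	 */
	priority_mask = FIELD_GET(FEAT_GCIE_ICH_VMCR_EL2_VPMR, cpu_if->vgic_vmcr);

	return min(highest_ap, priority_mask + 1);
}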

> 
> > +}
> > +
> >  static bool vgic_v5_ppi_set_pending_state(struct kvm_vcpu *vcpu,
> >  					  struct vgic_irq *irq)
> >  {
> > @@ -131,6 +156,102 @@ void vgic_v5_set_ppi_ops(struct vgic_irq *irq)
> >  	}
> >  }
> >  
> > +
> > +/*
> > + * Sync back the PPI priorities to the vgic_irq shadow state
> > + */
> > +static void vgic_v5_sync_ppi_priorities(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vgic_v5_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v5;
> > +	int i, reg;
> > +
> > +	/* We have 16 PPI Priority regs */
> > +	for (reg = 0; reg < 16; reg++) {
> 
> I'd drag the declaration in as
> 	for (int reg = 0;
> 

Done.

> > +		const unsigned long priorityr = cpu_if->vgic_ppi_priorityr[reg];
> > +
> > +		for (i = 0; i < 8; ++i) {
> similar for int i = 0 here
> 
> Kernel style is getting more accepting of these 'modern' style things ;)
> Up to you though if you prefer old school.
> 
> > +			struct vgic_irq *irq;
> > +			u32 intid;
> > +			u8 priority;
> > +
> > +			priority = (priorityr >> (i * 8)) & 0x1f;
> 
> GENMASK(4, 0); maybe.  It's short enough (I can count to 1 f easily
> enough!) that I don't really mind which style you use for this.

Frankly, that makes the intent clearer to me so that's better.
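
So the sync loop becomes something along these lines (untested sketch):

	/* We have 16 PPI Priority regs, 8 PPIs per reg */
	for (int reg = 0; reg < 16; reg++) {
		const u64 priorityr = cpu_if->vgic_ppi_priorityr[reg];

		for (int i = 0; i < 8; ++i) {
			/* Each PPI has a 5-bit priority field */
			u8 priority = (priorityr >> (i * 8)) & GENMASK(4, 0);

			/* look up the vgic_irq and update irq->priority as before */
		}
	}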

> 
> > +
> > +			intid = FIELD_PREP(GICV5_HWIRQ_TYPE, GICV5_HWIRQ_TYPE_PPI);
> > +			intid |= FIELD_PREP(GICV5_HWIRQ_ID, reg * 8 + i);
> > +
> > +			irq = vgic_get_vcpu_irq(vcpu, intid);
> > +
> > +			scoped_guard(raw_spinlock, &irq->irq_lock)
> > +				irq->priority = priority;
> > +
> > +			vgic_put_irq(vcpu->kvm, irq);
> > +		}
> > +	}
> > +}
> > +
> > +bool vgic_v5_has_pending_ppi(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vgic_v5_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v5;
> > +	int i, reg;
> > +	unsigned int priority_mask;
> > +
> > +	/* If no pending bits are set, exit early */
> > +	if (likely(!cpu_if->vgic_ppi_pendr[0] && !cpu_if->vgic_ppi_pendr[1]))
> 
> That likely seems a little bit dubious. I'd be tempted to not mark this
> unless you have stats on running systems where the predictors get it
> wrong enough that the mark is useful.

OK, I've dropped that. Something to revisit once we have some hardware.

> 
> > +		return false;
> > +
> > +	priority_mask = vgic_v5_get_effective_priority_mask(vcpu);
> > +
> > +	/* If the combined priority mask is 0, nothing can be signalled! */
> > +	if (!priority_mask)
> > +		return false;
> > +
> > +	/* The shadow priority is only updated on demand, sync it across first */
> > +	vgic_v5_sync_ppi_priorities(vcpu);
> > +
> > +	for (reg = 0; reg < 2; reg++) {
> > +		unsigned long possible_bits;
> > +		const unsigned long enabler = cpu_if->vgic_ich_ppi_enabler_exit[reg];
> Given storage of vgic_ich_ppi_enabler_exit[reg] is a u64 and you are
> going to use that length explicitly (the 64 in the bitmap walk below)
> I'd make these u64s.  I've not really been keeping an eye open for this
> in other patches, so maybe look for other cases where an explicit
> length is clearer. u64 shorter as well!

Yeah, I'm making sure to use u64 wherever possible, with the exception
being the bit-based loops.

> 
> > +		const unsigned long pendr = cpu_if->vgic_ppi_pendr_exit[reg];
> > +		bool has_pending = false;
> > +
> > +		/* Check all interrupts that are enabled and pending */
> > +		possible_bits = enabler & pendr;
> > +
> > +		/*
> > +		 * Optimisation: pending and enabled with no active priorities
> > +		 */
> > +		if (possible_bits && priority_mask > 0x1f)
> 
> I 'think' priority_mask > 0x1f is always 0x20?  I'd match that
> explicitly so the relationship to the magic value comment above is
> obvious

Yeah, I've just changed that to explicitly check for 32 (matching what
I did in the other patch that introduces the effective priority
calculation).
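
i.e., something like (untested):

	/* An effective mask of 32 means no priority is masked out */
	if (possible_bits && priority_mask == 32)
		return true;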

Thanks,
Sascha

> 
> > +			return true;
> > +
> > +		for_each_set_bit(i, &possible_bits, 64) {
> > +			struct vgic_irq *irq;
> > +			u32 intid;
> > +
> > +			intid = FIELD_PREP(GICV5_HWIRQ_TYPE, GICV5_HWIRQ_TYPE_PPI);
> > +			intid |= FIELD_PREP(GICV5_HWIRQ_ID, reg * 64 + i);
> > +
> > +			irq = vgic_get_vcpu_irq(vcpu, intid);
> > +
> > +			scoped_guard(raw_spinlock, &irq->irq_lock) {
> > +				/*
> > +				 * We know that the interrupt is
> > +				 * enabled and pending, so only check
> > +				 * the priority.
> > +				 */
> > +				if (irq->priority <= priority_mask)
> > +					has_pending = true;
> > +			}
> > +
> > +			vgic_put_irq(vcpu->kvm, irq);
> > +
> > +			if (has_pending)
> > +				return true;
> > +		}
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >  /*
> >   * Detect any PPIs state changes, and propagate the state with KVM's
> >   * shadow structures.
> > diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> > index cb5d43b34462b..dfec6ed7936ed 100644
> > --- a/arch/arm64/kvm/vgic/vgic.c
> > +++ b/arch/arm64/kvm/vgic/vgic.c
> > @@ -1180,9 +1180,12 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
> >  	unsigned long flags;
> >  	struct vgic_vmcr vmcr;
> >  
> > -	if (!vcpu->kvm->arch.vgic.enabled)
> > +	if (!vcpu->kvm->arch.vgic.enabled && !vgic_is_v5(vcpu->kvm))
> >  		return false;
> >  
> > +	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V5)
> > +		return vgic_v5_has_pending_ppi(vcpu);
> > +
> >  	if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last)
> >  		return true;
> >  
> > diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> > index 978d7f8426361..65c031da83e78 100644
> > --- a/arch/arm64/kvm/vgic/vgic.h
> > +++ b/arch/arm64/kvm/vgic/vgic.h
> > @@ -388,6 +388,7 @@ int vgic_v5_probe(const struct gic_kvm_info *info);
> >  void vgic_v5_get_implemented_ppis(void);
> >  void vgic_v5_set_ppi_ops(struct vgic_irq *irq);
> >  int vgic_v5_set_ppi_dvi(struct kvm_vcpu *vcpu, u32 irq, bool dvi);
> > +bool vgic_v5_has_pending_ppi(struct kvm_vcpu *vcpu);
> >  void vgic_v5_flush_ppi_state(struct kvm_vcpu *vcpu);
> >  void vgic_v5_fold_ppi_state(struct kvm_vcpu *vcpu);
> >  void vgic_v5_load(struct kvm_vcpu *vcpu);
> 


