[PATCH v6 04/39] KVM: arm64: vgic: Split out mapping IRQs and setting irq_ops
Sascha Bischoff
Sascha.Bischoff at arm.com
Wed Mar 18 10:30:12 PDT 2026
On Tue, 2026-03-17 at 16:00 +0000, Marc Zyngier wrote:
> On Tue, 17 Mar 2026 11:40:59 +0000,
> Sascha Bischoff <Sascha.Bischoff at arm.com> wrote:
> >
> > Prior to this change, the act of mapping a virtual IRQ to a physical
> > one also set the irq_ops. Unmapping then reset the irq_ops to NULL.
> > So far, this has been fine and hasn't caused any major issues.
> >
> > Now, however, as GICv5 support is being added to KVM, it has become
> > apparent that conflating mapping/unmapping IRQs and setting/clearing
> > irq_ops can cause issues. The reason is that the upcoming GICv5
> > support introduces a set of default irq_ops for PPIs, and removing
> > this when unmapping will cause things to break rather horribly.
> >
> > Split out the mapping/unmapping of IRQs from the setting/clearing of
> > irq_ops. The arch timer code is updated to set the irq_ops following
> > a successful map. The irq_ops are intentionally not removed again on
> > an unmap as the only irq_op introduced by the arch timer only takes
> > effect if the hw bit in struct vgic_irq is set. Therefore, it is
> > safe to leave this in place, and it avoids additional complexity
> > when GICv5 support is introduced.
> >
> > Signed-off-by: Sascha Bischoff <sascha.bischoff at arm.com>
> > ---
> >  arch/arm64/kvm/arch_timer.c | 32 ++++++++++++++++++-------------
> >  arch/arm64/kvm/vgic/vgic.c  | 38 ++++++++++++++++++++++++++++++++------
> >  include/kvm/arm_vgic.h      |  5 ++++-
> >  3 files changed, 55 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 600f250753b45..1f536dd5978d4 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -740,14 +740,17 @@ static void kvm_timer_vcpu_load_nested_switch(struct kvm_vcpu *vcpu,
> >
> >  	ret = kvm_vgic_map_phys_irq(vcpu,
> >  				    map->direct_vtimer->host_timer_irq,
> > -				    timer_irq(map->direct_vtimer),
> > -				    &arch_timer_irq_ops);
> > -	WARN_ON_ONCE(ret);
> > +				    timer_irq(map->direct_vtimer));
> > +	if (!WARN_ON_ONCE(ret))
> > +		kvm_vgic_set_irq_ops(vcpu, timer_irq(map->direct_vtimer),
> > +				     &arch_timer_irq_ops);
> > +
> >  	ret = kvm_vgic_map_phys_irq(vcpu,
> >  				    map->direct_ptimer->host_timer_irq,
> > -				    timer_irq(map->direct_ptimer),
> > -				    &arch_timer_irq_ops);
> > -	WARN_ON_ONCE(ret);
> > +				    timer_irq(map->direct_ptimer));
> > +	if (!WARN_ON_ONCE(ret))
> > +		kvm_vgic_set_irq_ops(vcpu, timer_irq(map->direct_ptimer),
> > +				     &arch_timer_irq_ops);
>
> Do we really need this eager setting of ops? Given that nothing seems
> to clear them, why can't we just set the ops at vcpu init time? Given
> that this is a pretty hot path (on each exception/exception return
> between L2 and L1), the least we do here, the better.
Hmm, I think you're right. When making this change, I was trying to
preserve the existing behaviour, so I set the irq_ops on each map call.
However, as you say, nothing is clearing the ops (as things stand, at
least), so setting them once at vcpu init time does indeed make sense
to me.
>
> > }
> > }
> >
> > @@ -1565,20 +1568,23 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >
> >  	ret = kvm_vgic_map_phys_irq(vcpu,
> >  				    map.direct_vtimer->host_timer_irq,
> > -				    timer_irq(map.direct_vtimer),
> > -				    &arch_timer_irq_ops);
> > +				    timer_irq(map.direct_vtimer));
> >  	if (ret)
> >  		return ret;
> >
> > +	kvm_vgic_set_irq_ops(vcpu, timer_irq(map.direct_vtimer),
> > +			     &arch_timer_irq_ops);
> > +
> >  	if (map.direct_ptimer) {
> >  		ret = kvm_vgic_map_phys_irq(vcpu,
> >  					    map.direct_ptimer->host_timer_irq,
> > -					    timer_irq(map.direct_ptimer),
> > -					    &arch_timer_irq_ops);
> > -	}
> > +					    timer_irq(map.direct_ptimer));
> > +		if (ret)
> > +			return ret;
> >
> > -	if (ret)
> > -		return ret;
> > +		kvm_vgic_set_irq_ops(vcpu, timer_irq(map.direct_ptimer),
> > +				     &arch_timer_irq_ops);
> > +	}
>
> which would mean moving this to kvm_timer_vcpu_init().
This, however, is not quite that simple.
It turns out that we actually call kvm_timer_vcpu_init() before
kvm_vgic_vcpu_init() from kvm_arch_vcpu_create(), meaning that we don't
have the private IRQs allocated yet at that point.
Is there a good reason that the timer (and PMU) are initialised prior
to initialising the vgic?
I've tried making this change, and once I reorder the timer and vgic
initialisation I can confirm that things work both with and without
nested.
>
> >
> > no_vgic:
> > timer->enabled = 1;
> > diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> > index e22b79cfff965..e37c640d74bcf 100644
> > --- a/arch/arm64/kvm/vgic/vgic.c
> > +++ b/arch/arm64/kvm/vgic/vgic.c
> > @@ -553,10 +553,38 @@ int kvm_vgic_inject_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
> >  	return 0;
> >  }
> >
> > +void kvm_vgic_set_irq_ops(struct kvm_vcpu *vcpu, u32 vintid,
> > + struct irq_ops *ops)
> > +{
> > + struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, vintid);
> > +
> > + BUG_ON(!irq);
> > +
> > + scoped_guard(raw_spinlock_irqsave, &irq->irq_lock)
> > + {
> > + irq->ops = ops;
> > + }
>
> nit: opening brace in the wrong spot, and overall not useful. This
> could simply be written as:
>
> scoped_guard(raw_spinlock_irqsave, &irq->irq_lock)
> irq->ops = ops;
Argh, sorry that slipped through!
>
> > +
> > + vgic_put_irq(vcpu->kvm, irq);
> > +}
> > +
> > +void kvm_vgic_clear_irq_ops(struct kvm_vcpu *vcpu, u32 vintid)
> > +{
> > + struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, vintid);
> > +
> > + BUG_ON(!irq);
> > +
> > + scoped_guard(raw_spinlock_irqsave, &irq->irq_lock)
> > + {
> > + irq->ops = NULL;
> > + }
> > +
> > + vgic_put_irq(vcpu->kvm, irq);
> > +}
> > +
>
> nit: that could also be written as:
>
> void kvm_vgic_clear_irq_ops(struct kvm_vcpu *vcpu, u32 vintid)
> {
> kvm_vgic_set_irq_ops(vcpu, vintid, NULL);
> }
Ah, that is indeed cleaner.
>
> I can fix all of it when applying if that works for you.
If you're happy to do that, that is great! Do note what I said above
regarding the order of vcpu and timer init!
Thanks,
Sascha
>
> Thanks,
>
> M.
>
More information about the linux-arm-kernel mailing list