[PATCH 1/2] KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
Marc Zyngier
marc.zyngier at arm.com
Thu Mar 8 09:04:38 PST 2018
On Thu, 08 Mar 2018 16:02:42 +0000,
Christoffer Dall wrote:
>
> On Thu, Mar 08, 2018 at 10:19:49AM +0000, Marc Zyngier wrote:
> > On 07/03/18 23:34, Christoffer Dall wrote:
> > > On Wed, Mar 7, 2018 at 12:40 PM, Marc Zyngier <marc.zyngier at arm.com> wrote:
> > >> The vgic code is trying to be clever when injecting GICv2 SGIs,
> > >> and will happily populate LRs with the same interrupt number if
> > >> they come from multiple vcpus (after all, they are distinct
> > >> interrupt sources).
> > >>
> > >> Unfortunately, this is against the letter of the architecture,
> > >> and the GICv2 architecture spec says "Each valid interrupt stored
> > >> in the List registers must have a unique VirtualID for that
> > >> virtual CPU interface.". GICv3 has similar (although slightly
> > >> ambiguous) restrictions.
> > >>
> > >> This results in guests locking up when using GICv2-on-GICv3, for
> > >> example. The obvious fix is to stop trying so hard, and inject
> > >> a single vcpu per SGI per guest entry. After all, pending SGIs
> > >> with multiple source vcpus are pretty rare, and are mostly seen
> > >> in scenario where the physical CPUs are severely overcomitted.
> > >>
> > >> Cc: stable at vger.kernel.org
> > >> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > >> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
> > >> ---
> > >> virt/kvm/arm/vgic/vgic.c | 11 +----------
> > >> 1 file changed, 1 insertion(+), 10 deletions(-)
> > >>
> > >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > >> index c7c5ef190afa..1f7ff175f47b 100644
> > >> --- a/virt/kvm/arm/vgic/vgic.c
> > >> +++ b/virt/kvm/arm/vgic/vgic.c
> > >> @@ -720,18 +720,9 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> > >> list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
> > >> spin_lock(&irq->irq_lock);
> > >>
> > >> - if (unlikely(vgic_target_oracle(irq) != vcpu))
> > >> - goto next;
> > >> -
> > >> - /*
> > >> - * If we get an SGI with multiple sources, try to get
> > >> - * them in all at once.
> > >> - */
> > >> - do {
> > >> + if (likely(vgic_target_oracle(irq) == vcpu))
> > >> vgic_populate_lr(vcpu, irq, count++);
> > >
> > > I think we need to change vgic_populate_lr to set the EOI maintenance
> > > interrupt flag so that when the interrupt is deactivated, if there are
> > > additional pending sources, we exit the guest and pick up the
> > > interrupt.
> >
> > Potentially. We need to be careful about about the semantics of EOI MI
> > with non-level interrupts (see the other thread about EOI signalling).
>
> I'll have a look.
>
> >
> > > An alternative would be to set the underflow interrupt, but I don't
> > > think that would be correct for multiple priorities, because the SGI
> > > could have a higher priority than other pending interrupts we put in
> > > the LR.
> >
> > I don't think priorities are the issue (after all, we already sort the
> > irqs in order of priority).
>
> Yes, but assume you have three pending interrupts, one SGI from two
> sources, and one SPI, and assume that the SGI has priority 1 and SPI
> priority 2 (lower means higher priority), then I think with underflow or
> the no-pending interrupt flag, we'll deliver the SGI from the first
> source, and then the SPI, and then exiting to pick up the last SGI from
> the other source. That's not how I understand the GIC architecture is
> supposed to work. Am I missing something?
No, you do have a point here. I'm so glad this is a v2 only thing.
>
> > My worry is that underflow is allowed to
> > fire if there is one interrupt pending, which implies that you could
> > end-up in a livelock scenario if you only have one SGI pending with
> > multiple sources.
>
> Yes, doesn't work, so I think it should be a maintenance interrupt on
> EOI.
>
> >
> > Another possibility would be to use ICH_HCR_EL2.NPIE (GICH_HCR.NPIE on
> > GICv2), which delivers a a MI if no pending interrupts are present. Once
> > the SGI has been activated, we're guaranteed to be able to inject a new
> > pending one.
> >
> > I like the latter, because it doesn't overload the rest of the code with
> > new semantics. Thoughts?
> >
>
> I'm fine with that if I can be proven wrong about the multiple sources
> and priorities.
I'll have a look at respining this with MI on EOI.
Thanks,
M.
--
Jazz is not dead, it just smell funny.
More information about the linux-arm-kernel
mailing list