[PATCH 1/3] arm/arm64: KVM: Fix arch timer behavior for disabled interrupts
Eric Auger
eric.auger at linaro.org
Mon Oct 19 06:27:42 PDT 2015
On 10/19/2015 03:14 PM, Christoffer Dall wrote:
> On Mon, Oct 19, 2015 at 03:07:16PM +0200, Eric Auger wrote:
>> Hi Christoffer,
>> On 10/17/2015 10:30 PM, Christoffer Dall wrote:
>>> We have an interesting issue when the guest disables the timer interrupt
>>> on the VGIC, which happens when turning VCPUs off using PSCI, for
>>> example.
>>>
>>> The problem is that because the guest disables the virtual interrupt at
>>> the VGIC level, we never inject interrupts to the guest and therefore
>>> never mark the interrupt as active on the physical distributor. The
>>> host also never takes the timer interrupt (we only use the timer device
>>> to trigger a guest exit and everything else is done in software), so the
>>> interrupt does not become active through normal means.
>>>
>>> The result is that we keep entering the guest with a programmed timer
>>> that will always fire as soon as we context switch the hardware timer
>>> state and run the guest, preventing forward progress for the VCPU.
>>>
>>> Since the active state on the physical distributor is really part of the
>>> timer logic, it is the job of our virtual arch timer driver to manage
>>> this state.
>>>
>>> The timer->map->active boolean field indicates whether we have signalled
>>> this interrupt to the vgic and if that interrupt is still pending or
>>> active. As long as that is the case, the hardware doesn't have to
>>> generate physical interrupts and therefore we mark the interrupt as
>>> active on the physical distributor.
>>>
>>> Cc: Marc Zyngier <marc.zyngier at arm.com>
>>> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi at arm.com>
>>> Signed-off-by: Christoffer Dall <christoffer.dall at linaro.org>
>>> ---
>>> virt/kvm/arm/arch_timer.c | 19 +++++++++++++++++++
>>> virt/kvm/arm/vgic.c | 43 +++++++++++--------------------------------
>>> 2 files changed, 30 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 48c6e1a..b9d3a32 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -137,6 +137,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>>> void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
>>> {
>>> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>> + bool phys_active;
>>> + int ret;
>>>
>>> /*
>>> * We're about to run this vcpu again, so there is no need to
>>> @@ -151,6 +153,23 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
>>> */
>>> if (kvm_timer_should_fire(vcpu))
>>> kvm_timer_inject_irq(vcpu);
>>> +
>>> + /*
>>> + * We keep track of whether the edge-triggered interrupt has been
>>> + * signalled to the vgic/guest, and if so, we mask the interrupt and
>>> + * the physical distributor to prevent the timer from raising a
>>> + * physical interrupt whenever we run a guest, preventing forward
>>> + * VCPU progress.
>> In practice, don't you simply mark the IRQ as active at the GIC physical
>> distributor level, hence preventing the same IRQ from hitting again?
>
> yes, that's what I meant with my comment, I should reword to "...we mark
> the interrupt as active on the physical distributor..."
>
>>> + */
>>> + if (kvm_vgic_get_phys_irq_active(timer->map))
>>> + phys_active = true;
>>> + else
>>> + phys_active = false;
>>> +
>>> + ret = irq_set_irqchip_state(timer->map->irq,
>>> + IRQCHIP_STATE_ACTIVE,
>>> + phys_active);
>>
>> physical distributor state is set in arch timer flush. It relates to a
>> shared device behavior so I find it natural to do it there.
>>
>> However, map->active is set in the arch_timer IRQ injection and unset in
>> vgic sync. Why not do the set in kvm_vgic_inject_mapped_irq?
>
> Because you have to set it at every entry to the guest if you run
> multiple VCPUs/VMs on this CPU or migrate this VCPU to a different CPU.
I meant the kvm_vgic_set_phys_irq_active(timer->map, true) call in
kvm_timer_inject_irq. Couldn't that be done in
kvm_vgic_inject_mapped_irq instead? Doesn't this apply to all mapped IRQs?
Eric
>
>>
>>> + WARN_ON(ret);
>>> }
>>>
>>> /**
>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>> index 596455a..ea21bc2 100644
>>> --- a/virt/kvm/arm/vgic.c
>>> +++ b/virt/kvm/arm/vgic.c
>>> @@ -1092,6 +1092,15 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
>>> struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
>>>
>>> + /*
>>> + * We must transfer the pending state back to the distributor before
>>> + * retiring the LR, otherwise we may lose edge-triggered interrupts.
>>> + */
>>> + if (vlr.state & LR_STATE_PENDING) {
>>> + vgic_dist_irq_set_pending(vcpu, irq);
>>> + vlr.hwirq = 0;
>>> + }
>> That fix applies to any edge-sensitive IRQ, i.e. not specifically the
>> timer's one? If so, shouldn't you state this in the commit
>> msg too?
>>
>
> Probably, it could also be a separate patch. I'll rework this.
>
> Thanks,
> -Christoffer
>
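
For readers following the thread, the point about re-applying the active
state on every guest entry can be sketched as a small user-space model. This
is not kernel code and not the patch itself: the struct and function names
below (irq_map_model, phys_dist_model, model_flush_hwstate,
model_guest_entry_exits_immediately) are hypothetical, and the model assumes
that the physical active state is tracked per CPU and that an expired timer
forces an immediate exit unless its interrupt is already marked active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the virtual-to-physical IRQ mapping. */
struct irq_map_model {
	/* True while the IRQ was signalled to the vgic and is still
	 * pending or active there. */
	bool active;
};

/* Hypothetical stand-in for one CPU's view of the physical distributor. */
struct phys_dist_model {
	bool timer_irq_active;
};

/*
 * Modelled on the flush step discussed above: on every guest entry,
 * mirror the software flag into the physical active state of the CPU
 * we are about to run on.
 */
static void model_flush_hwstate(const struct irq_map_model *map,
				struct phys_dist_model *dist)
{
	assert(map != NULL && dist != NULL);
	dist->timer_irq_active = map->active;
}

/*
 * Assumption of this model: an expired timer triggers an immediate guest
 * exit unless its interrupt is already active on this CPU.
 */
static bool model_guest_entry_exits_immediately(bool timer_expired,
						const struct phys_dist_model *dist)
{
	return timer_expired && !dist->timer_irq_active;
}
```

In this model, setting the flag once at injection time is not enough: a VCPU
that moves to another CPU finds a physical state that was never updated, so
the flush has to run on each entry.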