KVM: Nested VGIC emulation leads to infinite IRQ exceptions
Volodymyr Babchuk
Volodymyr_Babchuk at epam.com
Tue Sep 30 14:11:54 PDT 2025
Hi all,
We are trying to run Xen as a nested hypervisor under KVM (again!) and we
have encountered a strange issue with nested GIC emulation. I am certain
that we'll get to the root cause eventually, but perhaps someone on the ML
can save us a couple of days of debugging by providing some insight.
The setup is as follows: QEMU 9.2 is running Xen 4.20 with KVM (latest
Linux master branch) as the accelerator. QEMU provides a couple of virtio
devices to the VM, and some of these devices are passed through to DomU
(we had to hook these devices to the vSMMU, but that is another
story). Sometimes we observe the following sequence of events:
1. DomU gets IRQ from a virtio device
2. DomU acknowledges the IRQ by reading IAR1 register
3. DomU is unable to deactivate the IRQ (there is no write to the
EOI register)
We are not sure why this happens, but our current theory is that DomU's
vvCPU0 is interrupted by Xen's timer interrupt while it is handling the
IRQ. We have not been able to catch this specific moment in the KVM
trace because many events are lost. In any case, after this we see
the following loop:
4. vCPU switches to Xen via IRQ Exception
5. Xen reads IAR1 to get IRQ nr, but gets 1023 (aka no IRQs)
6. Xen issues ERET to return back to guest
7. GOTO 4.
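For readers less familiar with the GIC, step 5 can be illustrated with a small sketch. It assumes only the architectural INTID ranges (1020-1023 are special values, and 1023 means there is nothing to acknowledge); the helper name classify_intid is made up for illustration and is not taken from Xen or KVM.

```python
def classify_intid(iar: int) -> str:
    """Classify the INTID field of an ICC_IAR1_EL1 read.

    Hypothetical helper for illustration; only the INTID ranges are
    architectural.
    """
    intid = iar & 0xFFFFFF  # INTID lives in bits [23:0]
    if intid <= 15:
        return "SGI"
    if intid <= 31:
        return "PPI"
    if intid <= 1019:
        return "SPI"
    if intid <= 1023:
        # 1020-1023 are special; 1023 means no pending interrupt of
        # sufficient priority for this group, i.e. a spurious IRQ.
        return "spurious" if intid == 1023 else "special"
    if intid >= 8192:
        return "LPI"
    return "reserved-or-extended"
```

In the loop above, Xen keeps seeing the "spurious" case: the IRQ exception is taken, but the CPU interface has nothing to hand out.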
This loop effectively leaves the whole vCPU stuck. We also noticed that
DomU's vvCPU gets stuck right after accessing a virtio MMIO register. So
it looks like this is what happens:
1. QEMU sends virtio IRQ to the VM
2. Xen handles the IRQ and injects it into DomU
3. DomU tries to handle it and accesses a virtio MMIO register
4. This produces a memory fault that leads to a switch back to KVM (and
then to QEMU, of course) so QEMU can handle the MMIO access
5. When QEMU resumes the vCPU thread, it immediately gets switched back to
vEL2 (probably due to a timer IRQ, but this is my speculation)
6. The vCPU spins in the aforementioned loop
It looks like this happens because the LRs (list registers) are empty, but
we have not confirmed this yet because the issue is not 100% reproducible.
I'll be glad to hear any suggestions.
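One way to check the empty-LR theory would be to dump the ICH_LR<n>_EL2 values on entry to the IRQ handler and decode them offline. The sketch below is only that, a sketch: the field layout follows my reading of the GICv3 specification (State in bits [63:62], HW in bit 61, Group in bit 60, Priority in bits [55:48], pINTID in bits [44:32], vINTID in bits [31:0]), and the names decode_lr and all_lrs_empty are hypothetical.

```python
from dataclasses import dataclass

STATE_NAMES = {0: "invalid", 1: "pending", 2: "active", 3: "pending+active"}


@dataclass(frozen=True)
class ListRegister:
    state: str
    hw: bool
    group: int
    priority: int
    pintid: int  # only meaningful when hw is True
    vintid: int


def decode_lr(value: int) -> ListRegister:
    """Split a raw 64-bit ICH_LR<n>_EL2 value into its fields."""
    return ListRegister(
        state=STATE_NAMES[(value >> 62) & 0x3],
        hw=bool((value >> 61) & 1),
        group=(value >> 60) & 1,
        priority=(value >> 48) & 0xFF,
        # With HW == 0 this field is not a physical INTID
        # (bit 41 is the EOI maintenance bit instead).
        pintid=(value >> 32) & 0x1FFF,
        vintid=value & 0xFFFFFFFF,
    )


def all_lrs_empty(values) -> bool:
    """True if every list register is in the invalid (free) state."""
    return all(decode_lr(v).state == "invalid" for v in values)
```

If all_lrs_empty() returns True for the dumped values while the IRQ exception keeps firing, that would support the theory; a leftover "active" entry for the virtio interrupt would point elsewhere.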
Below is part of the KVM trace, where you can see that the vCPU in
question tries to ERET to Linux in DomU but is brought straight back to
vEL2. In this particular case this is vCPU1 / vvCPU0. I filtered out
other vCPUs to reduce clutter.
qemu-system-aar-41290 [000] d.... 12023.695620: kvm_entry: PC: 0x00000a0000267c80
qemu-system-aar-41290 [000] d.... 12023.695620: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695621: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695621: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-41290 [000] d.... 12023.695621: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-41290 [000] ..... 12023.695621: kvm_exit: TRAP: HSR_EC: 0x001a (ERET), PC: 0x00000a00002674e0
qemu-system-aar-41290 [000] ..... 12023.695621: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695622: kvm_timer_save_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-41290 [000] d.... 12023.695622: kvm_timer_save_state: CTL: 0x000005 CVAL: 0x426f7d24736c arch_timer_ctx_index: 3
qemu-system-aar-41290 [000] ..... 12023.695622: kvm_nested_eret: elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
qemu-system-aar-41290 [000] ..... 12023.695622: kvm_get_timer_map: VCPU: 1, dv: 1, dp: 0, ev: 2, ep: 3
qemu-system-aar-41290 [000] ..... 12023.695622: kvm_timer_update_irq: VCPU: 1, IRQ 27, level 1
qemu-system-aar-41290 [000] ..... 12023.695623: vgic_update_irq_pending: VCPU: 1, IRQ 27, level: 1
qemu-system-aar-41290 [000] ..... 12023.695623: kvm_timer_update_irq: VCPU: 1, IRQ 30, level 0
qemu-system-aar-41290 [000] ..... 12023.695623: vgic_update_irq_pending: VCPU: 1, IRQ 30, level: 0
qemu-system-aar-41290 [000] d.... 12023.695623: kvm_timer_restore_state: CTL: 0x000005 CVAL: 0x48aac64bd arch_timer_ctx_index: 1
qemu-system-aar-41290 [000] d.... 12023.695624: kvm_timer_restore_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 0
qemu-system-aar-41290 [000] ..... 12023.695624: kvm_timer_emulate: arch_timer_ctx_index: 2 (should_fire: 0)
qemu-system-aar-41290 [000] ..... 12023.695624: kvm_timer_emulate: arch_timer_ctx_index: 3 (should_fire: 1)
qemu-system-aar-41290 [000] ..... 12023.695626: kvm_get_timer_map: VCPU: 1, dv: 1, dp: 0, ev: 2, ep: 3
qemu-system-aar-41290 [000] d.... 12023.695626: kvm_timer_save_state: CTL: 0x000005 CVAL: 0x48aac64bd arch_timer_ctx_index: 1
qemu-system-aar-41290 [000] d.... 12023.695627: kvm_timer_save_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 0
qemu-system-aar-41290 [000] ..... 12023.695627: kvm_inject_nested_exception: IRQ: esr_el2 0x0 elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
qemu-system-aar-41290 [000] ..... 12023.695627: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] ..... 12023.695627: kvm_timer_update_irq: VCPU: 1, IRQ 28, level 0
qemu-system-aar-41290 [000] ..... 12023.695627: vgic_update_irq_pending: VCPU: 1, IRQ 28, level: 0
qemu-system-aar-41290 [000] ..... 12023.695628: kvm_timer_update_irq: VCPU: 1, IRQ 26, level 1
qemu-system-aar-41290 [000] ..... 12023.695628: vgic_update_irq_pending: VCPU: 1, IRQ 26, level: 1
qemu-system-aar-41290 [000] d.... 12023.695628: kvm_timer_restore_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-41290 [000] d.... 12023.695628: kvm_timer_restore_state: CTL: 0x000005 CVAL: 0x426f7d24736c arch_timer_ctx_index: 3
qemu-system-aar-41290 [000] ..... 12023.695629: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-41290 [000] ..... 12023.695629: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-41290 [000] d.... 12023.695632: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695632: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695633: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695633: kvm_entry: PC: 0x00000a0000267c80
qemu-system-aar-41290 [000] d.... 12023.695634: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695634: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695634: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-41290 [000] d.... 12023.695635: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-41290 [000] ..... 12023.695635: kvm_exit: TRAP: HSR_EC: 0x001a (ERET), PC: 0x00000a00002674e0
qemu-system-aar-41290 [000] ..... 12023.695635: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695635: kvm_timer_save_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-41290 [000] d.... 12023.695635: kvm_timer_save_state: CTL: 0x000005 CVAL: 0x426f7d24736c arch_timer_ctx_index: 3
qemu-system-aar-41290 [000] ..... 12023.695636: kvm_nested_eret: elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
qemu-system-aar-41290 [000] ..... 12023.695636: kvm_get_timer_map: VCPU: 1, dv: 1, dp: 0, ev: 2, ep: 3
qemu-system-aar-41290 [000] ..... 12023.695636: kvm_timer_update_irq: VCPU: 1, IRQ 27, level 1
qemu-system-aar-41290 [000] ..... 12023.695636: vgic_update_irq_pending: VCPU: 1, IRQ 27, level: 1
qemu-system-aar-41290 [000] ..... 12023.695636: kvm_timer_update_irq: VCPU: 1, IRQ 30, level 0
qemu-system-aar-41290 [000] ..... 12023.695637: vgic_update_irq_pending: VCPU: 1, IRQ 30, level: 0
qemu-system-aar-41290 [000] d.... 12023.695637: kvm_timer_restore_state: CTL: 0x000005 CVAL: 0x48aac64bd arch_timer_ctx_index: 1
qemu-system-aar-41290 [000] d.... 12023.695637: kvm_timer_restore_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 0
qemu-system-aar-41290 [000] ..... 12023.695637: kvm_timer_emulate: arch_timer_ctx_index: 2 (should_fire: 0)
qemu-system-aar-41290 [000] ..... 12023.695637: kvm_timer_emulate: arch_timer_ctx_index: 3 (should_fire: 1)
qemu-system-aar-41290 [000] ..... 12023.695640: kvm_get_timer_map: VCPU: 1, dv: 1, dp: 0, ev: 2, ep: 3
qemu-system-aar-41290 [000] d.... 12023.695640: kvm_timer_save_state: CTL: 0x000005 CVAL: 0x48aac64bd arch_timer_ctx_index: 1
qemu-system-aar-41290 [000] d.... 12023.695640: kvm_timer_save_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 0
qemu-system-aar-41290 [000] ..... 12023.695640: kvm_inject_nested_exception: IRQ: esr_el2 0x0 elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
qemu-system-aar-41290 [000] ..... 12023.695641: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] ..... 12023.695641: kvm_timer_update_irq: VCPU: 1, IRQ 28, level 0
qemu-system-aar-41290 [000] ..... 12023.695641: vgic_update_irq_pending: VCPU: 1, IRQ 28, level: 0
qemu-system-aar-41290 [000] ..... 12023.695641: kvm_timer_update_irq: VCPU: 1, IRQ 26, level 1
qemu-system-aar-41290 [000] ..... 12023.695641: vgic_update_irq_pending: VCPU: 1, IRQ 26, level: 1
qemu-system-aar-41290 [000] d.... 12023.695642: kvm_timer_restore_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-41290 [000] d.... 12023.695642: kvm_timer_restore_state: CTL: 0x000005 CVAL: 0x426f7d24736c arch_timer_ctx_index: 3
qemu-system-aar-41290 [000] ..... 12023.695642: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-41290 [000] ..... 12023.695642: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-41290 [000] d.... 12023.695644: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695645: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695645: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695646: kvm_entry: PC: 0x00000a0000267c80
qemu-system-aar-41290 [000] d.... 12023.695647: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
qemu-system-aar-41290 [000] d.... 12023.695647: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695647: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-41290 [000] d.... 12023.695647: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-41290 [000] ..... 12023.695647: kvm_exit: TRAP: HSR_EC: 0x001a (ERET), PC: 0x00000a00002674e0
qemu-system-aar-41290 [000] ..... 12023.695648: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-41290 [000] d.... 12023.695648: kvm_timer_save_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-41290 [000] d.... 12023.695648: kvm_timer_save_state: CTL: 0x000005 CVAL: 0x426f7d24736c arch_timer_ctx_index: 3
qemu-system-aar-41290 [000] ..... 12023.695648: kvm_nested_eret: elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
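To quantify the loop in a longer capture, the trace can be scanned for a kvm_nested_eret that is followed by a kvm_inject_nested_exception IRQ with the same elr_el2. This is a minimal sketch, assuming the event text looks exactly like the lines above; count_bounces and SAMPLE are names invented here, and the two SAMPLE lines are copied from the trace.

```python
import re
from collections import Counter

ERET_RE = re.compile(r"kvm_nested_eret: elr_el2: (0x[0-9a-f]+)")
INJECT_RE = re.compile(
    r"kvm_inject_nested_exception: IRQ: .*?elr_el2: (0x[0-9a-f]+)"
)


def count_bounces(lines):
    """Count ERET -> IRQ injection round trips, keyed by elr_el2."""
    bounces = Counter()
    last_eret = None
    for line in lines:
        m = ERET_RE.search(line)
        if m:
            last_eret = int(m.group(1), 16)
            continue
        m = INJECT_RE.search(line)
        if m:
            elr = int(m.group(1), 16)
            # Same return address means the guest made no progress.
            if elr == last_eret:
                bounces[elr] += 1
            last_eret = None
    return bounces


SAMPLE = [
    "qemu-system-aar-41290 [000] ..... 12023.695622: kvm_nested_eret: "
    "elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) "
    "hcr_el2: 807c663f",
    "qemu-system-aar-41290 [000] ..... 12023.695627: "
    "kvm_inject_nested_exception: IRQ: esr_el2 0x0 "
    "elr_el2: 0xffffffc0010ac5a4 spsr_el2: 0x024000c5 (M: EL1h) "
    "hcr_el2: 807c663f",
]

if __name__ == "__main__":
    for elr, n in count_bounces(SAMPLE).items():
        print(f"{elr:#x}: {n} bounce(s)")
```

A large count for a single address, as with 0xffffffc0010ac5a4 here, marks the instruction DomU never gets past.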
--
WBR, Volodymyr