[PATCH v4 35/49] KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive
Vishnu Pajjuri
vishnu at os.amperecomputing.com
Wed Apr 22 07:57:44 PDT 2026
Hi Marc,
On 22-04-2026 12:25, Marc Zyngier wrote:
> [+ Darren]
>
> On Tue, 31 Mar 2026 10:42:57 +0100,
> Marc Zyngier <maz at kernel.org> wrote:
>>
>> On Tue, 31 Mar 2026 07:31:54 +0100,
>> Vishnu Pajjuri <vishnu at os.amperecomputing.com> wrote:
>>>
>>>>> LOG:
>>>>> [ 164.647367] Call trace:
>>>>> [ 164.647368] smp_call_function_many_cond+0x334/0x7a0 (P)
>>>>> [ 164.647372] smp_call_function_many+0x20/0x40
>>>>> [ 164.647374] kvm_make_all_cpus_request+0xec/0x1b8
>>>>> [ 164.647377] vgic_queue_irq_unlock+0x1c8/0x2c8
>>>>> [ 164.647380] kvm_vgic_inject_irq+0x194/0x1e0
>>>>> [ 164.647381] kvm_vm_ioctl_irq_line+0x170/0x400
>>>>> [ 164.647386] kvm_vm_ioctl+0x7b8/0xc88
>>>>> [ 164.647389] __arm64_sys_ioctl+0xb4/0x118
>>>>> [ 164.647393] invoke_syscall+0x6c/0x100
>>>>> [ 164.647397] el0_svc_common.constprop.0+0x48/0xf0
>>>>> [ 164.647398] do_el0_svc+0x24/0x38
>>>>> [ 164.647400] el0_svc+0x3c/0x170
>>>>> [ 164.647403] el0t_64_sync_handler+0xa0/0xe8
>>>>> [ 164.647405] el0t_64_sync+0x1b0/0x1b8
>>>>
>>>> This trace is about interrupt injection from userspace, not
>>>> deactivation of a HW interrupt.
>>>> None of that makes much sense.
>>>
>>> Although this behavior is puzzling, it matches the trace I typically
>>> observe on L0. After reverting the patch, I was able to boot L2 guests
>>> successfully.
>>
>> Well, this patch fixes real bugs, so it isn't going anywhere.
>>
>> The patch you are reverting addresses the deactivation of a HW
>> interrupt, which is likely to be a timer (that's the only one we
>> support). The stacktrace points to the userspace injection of an SPI.
>>
>> If we need to broadcast IPI, that's because there is no other SPI
>> currently in flight. But if a CPU is not responding to the IPI, what
>> is it doing? How does this interact with the patch you are reverting?
>>
>> Given that I don't know what you're running, how you are running it,
>> that I don't have access to whatever HW you are using, and that you
>> are providing no useful information that'd help me debug this, I will
>> leave it up to you to debug it and come back with a detailed analysis
>> of the problem.
>
> Have you made progress on this? I can't reproduce it at all despite my
> best effort. I'm perfectly happy to help, but you need to give me
> *something* to go on.

Thanks for your support!

The issue is triggered as soon as the timer interrupt (IRQ 27) is
deactivated. With the deactivation of IRQ 27 suppressed during nested
VGIC state transitions, the failure no longer reproduces.

I am currently tracing execution paths and inspecting VGIC state to
determine how deactivating this interrupt leads to the observed behavior.

Regards,
-Vishnu.

> Thanks,
>
> M.
>