[PATCH] arm64: KVM: Initialize vGIC before preempt-disabled section in kvm_reset_vcpu()
Marc Zyngier
maz at kernel.org
Tue Apr 21 00:42:57 PDT 2026
On Tue, 21 Apr 2026 02:48:02 +0100,
Deepanshu Kartikey <kartikey406 at gmail.com> wrote:
>
> On Thu, Apr 16, 2026 at 7:50 PM Marc Zyngier <maz at kernel.org> wrote:
> >
> > On Sun, 12 Apr 2026 09:04:37 +0100,
> > Deepanshu Kartikey <kartikey406 at gmail.com> wrote:
> > >
> > > kvm_reset_vcpu() calls kvm_timer_vcpu_reset() inside a preempt-disabled
> > > section to avoid races with preempt notifiers that also call vcpu put/load.
> > >
> > > However, kvm_timer_vcpu_reset() eventually calls kvm_vgic_inject_irq()
> > > which triggers vgic_lazy_init() if the vGIC has not been initialized yet.
> > > vgic_lazy_init() acquires a mutex and calls vgic_init() which invokes
> > > synchronize_srcu_expedited() -- both of which may sleep. Sleeping inside
> > > a preempt-disabled section is illegal and causes:
> > >
> > > BUG: scheduling while atomic: syz.1.49/3699/0x00000002
> > >
> > > Fix this by calling vgic_lazy_init() before preempt_disable(). On the
> > > second call inside kvm_vgic_inject_irq(), vgic_initialized() will return
> > > true and vgic_lazy_init() will return immediately without sleeping.
> > >
> >
> > I think this really goes in the wrong direction. Forcing the vgic (a
> > global resource) to initialise when the vcpu's timer (a local
> > resource) is reset feels at best bizarre. Now you are promoting it to
> > be forced at vcpu reset. This makes things worse.
> >
> > You probably want to take a step back and look at *why* we end-up
> > here. The core reason seems to be that the timer emulation caches the
> > level in a per-timer structure, and tries hard not call into the vgic
> > unless the level changes. Which means that unless the vgic is
> > initialised and is able to latch that state, the initial pending state
> > will not be propagated to the guest.
> >
> > But do we need this optimisation? I don't think so. Other emulated
> > devices don't require it. We can let the vgic know the state of the
> > timer at every vcpu entry, just like we do for other virtual
> > interrupts that the kernel injects (PMU, vgic MI).
> >
> > Once you remove the this cache and the need for the vgic to buffer
> > things outside of normal execution, you can also drop the magic init
> > from the interrupt injection path, because the injection will happen
> > on the run path, just like any other PPI.
> >
> > That'd be a much better approach IMO.
> >
> > Thanks,
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
>
>
> Hi Marc,
>
> Thank you for the detailed feedback! I apologize for the delayed
> response — I was away on holiday.
No need to apologise. I hadn't spotted this report (syzkaller badly
triaged it), so thanks for bring it to my attention.
>
> I understand your point. My fix addresses the symptom rather than the
> root cause. Forcing vGIC (a global resource) to initialize during
> timer (a local resource) reset is not the right approach.
>
> I will take your suggestion and work on:
>
> I will send a v2 once I have something ready.
I've posted my own take on this at [1], for which you are on Cc. Feel
free to review it if you have the time (I'll post a v2 anyway, as I
have accumulated a couple of additional fixes).
Cheers,
M.
[1] https://lore.kernel.org/r/20260417124612.2770268-1-maz@kernel.org
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list