[PATCH v4 5/5] KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE
Marc Zyngier
maz at kernel.org
Fri May 16 02:52:06 PDT 2025

On Mon, 12 May 2025 15:09:09 +0100,
David Sauerwein <dssauerw at amazon.de> wrote:
>
> Hi Jing,
>
> After pulling this patch in via the v6.6.64 and v5.10.226 LTS releases, I see
> NULL pointer dereferences in some guests. The dereference happens in different
> parts of the kernel outside of the GIC driver (file systems, NVMe driver,
> etc.). The issue only appears once every few hundred DISCARDs / guest boots.
> Reverting the commit does fix the problem. I have seen multiple different guest
> kernel versions (4.14, 5.15) and distributions exhibit this issue.

Where is the guest stack trace?

> The issue looks like some kind of race. I think the guest re-uses the memory
> allocated for the ITT before the hypervisor is actually done with the DISCARD
> command, i.e. before it zeros the ITE. From what I can tell, the guest should
> wait for the command to finish via its_wait_for_range_completion(). I tried
> locking reads to its->cwriter in vgic_mmio_read_its_cwriter() and its->creadr
> in vgic_mmio_read_its_creadr() with its->cmd_lock in the hypervisor kernel, but
> that did not help. I also instrumented the guest kernel both via printk() and
> trace events. In both cases the issue disappears once the instrumentation is in
> place, so I'm not able to fully observe what is happening on the guest side.
>
> Do you have an idea of what might cause the issue?

I'm a bit sceptical of this analysis, because KVM makes no use of
guest-owned memory outside of a save/restore event, and otherwise
shadows everything.

So what are you *exactly* doing here? Have you reproduced this with an
upstream, current KVM host?

Thanks,

M.
--
Without deviation from the norm, progress is not possible.