[PATCH] KVM: arm64: Clear pending exception state before injecting a new one

Oliver Upton oliver.upton at linux.dev
Mon Jul 14 23:51:09 PDT 2025


Hey,

On Mon, Jul 14, 2025 at 03:46:36PM +0100, Marc Zyngier wrote:
> Repeatedly injecting an exception from userspace without running
> the vcpu between calls results in a nasty warning, as we're not
> really keen on losing already pending exceptions.
> 
> But this precaution doesn't really apply to userspace, who can
> do whatever it wants (within reason). So let's simply clear any
> previous exception state before injecting a new one.
> 
> Note that this is done unconditionally, even if the injection
> ultimately fails.
> 
> Reported-by: syzbot+4e09b1432de3774b86ae at syzkaller.appspotmail.com
> Signed-off-by: Marc Zyngier <maz at kernel.org>

Thanks for taking a look at this. I think the correct fix is a bit more
involved, as:

 - The ABI prior to my patches allowed dumb things like injecting both an
   SEA and an SError from the same ioctl. With your patch I think you could
   still get the warning to fire with serror_pending && ext_dabt_pending.

 - KVM_GET_VCPU_EVENTS is broken for 'pending' SEAs, as we assume
   they're committed in the vCPU state immediately when they're actually
   deferred to the next KVM_RUN.

I thoroughly hate the fix I have, but it should address both of these
issues. That said, the pending PC adjustment flags seem more like a
liability than anything else if ioctls need to flush them before
returning to userspace. I might look at a larger cleanup down the road.
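
To make the two failure modes concrete, here is a rough toy model of the
deferred vs. committed behavior. It is only a sketch: none of it is KVM
code, every toy_* name is made up for illustration, and it assumes the
SError takes the software-injection path (so that it also sets the
pending-exception flag).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vCPU; these are not KVM's structures. */
enum { TOY_SEA = 1, TOY_SERROR = 2 };

struct toy_vcpu {
	bool pending_exception;	/* models the PENDING_EXCEPTION flag */
	int pending_kind;	/* which exception is waiting */
	int committed[4];	/* exceptions made architecturally visible */
	int nr_committed;
	int nr_warnings;	/* times a pending exception was overwritten */
};

/* Queue an exception; count a warning if one is already waiting. */
static void toy_pend(struct toy_vcpu *v, int kind)
{
	if (v->pending_exception)
		v->nr_warnings++;	/* the earlier exception is lost */
	v->pending_exception = true;
	v->pending_kind = kind;
}

/* Move the pending exception into the "architectural" state, if any. */
static void toy_commit(struct toy_vcpu *v)
{
	if (!v->pending_exception)
		return;
	if (v->nr_committed < 4)
		v->committed[v->nr_committed++] = v->pending_kind;
	v->pending_exception = false;
}

/*
 * One "set events" call that may carry both an SEA and an SError.
 * commit_now == false models deferring to the next run;
 * commit_now == true models committing inside the ioctl.
 */
static void toy_set_events(struct toy_vcpu *v, bool sea, bool serror,
			   bool commit_now)
{
	if (sea) {
		toy_pend(v, TOY_SEA);
		if (commit_now)
			toy_commit(v);
	}
	if (serror) {
		toy_pend(v, TOY_SERROR);
		if (commit_now)
			toy_commit(v);
	}
}
```

With commit_now == false, a single call carrying both events bumps
nr_warnings and leaves nothing in committed[], which is roughly the
situation a "get events" call would misreport. With commit_now == true,
both exceptions land in committed[] in order and no warning fires.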

Thanks,
Oliver

From 149262689dfe881542f5c5b60f9ee308a00f0596 Mon Sep 17 00:00:00 2001
From: Oliver Upton <oliver.upton at linux.dev>
Date: Mon, 14 Jul 2025 23:25:07 -0700
Subject: [PATCH] KVM: arm64: Commit exceptions from KVM_SET_VCPU_EVENTS
 immediately

syzkaller has found that it can trip a warning in KVM's exception
emulation infrastructure by repeatedly injecting exceptions into the
guest.

While it's unlikely that a reasonable VMM will do this, further
investigation of the issue reveals that KVM can potentially discard the
"pending" SEA state. While the handling of KVM_GET_VCPU_EVENTS presumes
that userspace-injected SEAs are realized immediately, in reality the
emulated exception entry is deferred until the next call to KVM_RUN.

Hack-a-fix the immediate issues by committing the pending exceptions to
the vCPU's architectural state immediately in KVM_SET_VCPU_EVENTS. This
is no different to the way KVM-injected exceptions are handled in
KVM_RUN where we potentially call __kvm_adjust_pc() before returning to
userspace.

Signed-off-by: Oliver Upton <oliver.upton at linux.dev>
---
 arch/arm64/kvm/guest.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index e2702718d56d..16ba5e9ac86c 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -834,6 +834,19 @@ int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static void commit_pending_events(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+		return;
+
+	/*
+	 * Reset the MMIO emulation state to avoid stepping PC after emulating
+	 * the exception entry.
+	 */
+	vcpu->mmio_needed = false;
+	kvm_call_hyp(__kvm_adjust_pc, vcpu);
+}
+
 int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 			      struct kvm_vcpu_events *events)
 {
@@ -843,8 +856,15 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 	u64 esr = events->exception.serror_esr;
 	int ret = 0;
 
-	if (ext_dabt_pending)
+	/*
+	 * Immediately commit the pending SEA to the vCPU's architectural
+	 * state which is necessary since we do not return a pending SEA
+	 * to userspace via KVM_GET_VCPU_EVENTS.
+	 */
+	if (ext_dabt_pending) {
 		ret = kvm_inject_sea_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+		commit_pending_events(vcpu);
+	}
 
 	if (ret < 0)
 		return ret;
@@ -863,6 +883,12 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 	else
 		ret = kvm_inject_serror(vcpu);
 
+	/*
+	 * We could've decided that the SError is due for immediate software
+	 * injection; commit the exception in case userspace decides it wants
+	 * to inject more exceptions for some strange reason.
+	 */
+	commit_pending_events(vcpu);
 	return (ret < 0) ? ret : 0;
 }
 
-- 
2.39.5


