[PATCH] KVM: arm64/sve: Ensure SVE is trapped after guest exit

Mark Rutland mark.rutland at arm.com
Tue Jan 21 02:00:26 PST 2025


There is a period of time after returning from a KVM_RUN ioctl where
userspace may use SVE without trapping, but the kernel can unexpectedly
discard the live SVE state. Eric Auger has observed this causing QEMU
crashes where SVE is used by memmove():

  https://issues.redhat.com/browse/RHEL-68997

The only state discarded is the user SVE state of the task which issued
the KVM_RUN ioctl. Other tasks are unaffected, plain FPSIMD state is
unaffected, and kernel state is unaffected.

This happens because fpsimd_kvm_prepare() incorrectly manipulates the
FPSIMD/SVE state. When the vCPU is loaded, fpsimd_kvm_prepare()
unconditionally clears TIF_SVE but does not reconfigure CPACR_EL1.ZEN to
trap userspace SVE usage. If the vCPU does not use FPSIMD/SVE and hyp
does not save the host's FPSIMD/SVE state, the kernel may return to
userspace with TIF_SVE clear while SVE is still enabled in
CPACR_EL1.ZEN. Subsequent userspace usage of SVE will not be trapped,
and the next save of userspace FPSIMD/SVE state will only store the
FPSIMD portion due to TIF_SVE being clear, discarding any SVE state.

The broken logic was originally introduced in commit:

  93ae6b01bafee8fa ("KVM: arm64: Discard any SVE state when entering KVM guests")

... though at the time fp_user_discard() would reconfigure CPACR_EL1.ZEN
to trap subsequent SVE usage, masking the issue until that logic was
removed in commit:

  8c845e2731041f0f ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

Avoid this issue by reconfiguring CPACR_EL1.ZEN when clearing
TIF_SVE. At the same time, add a comment to explain why
current->thread.fp_type must be set even though the FPSIMD state is not
foreign. A similar issue exists when SME is enabled, and will require
further rework. As SME currently depends on BROKEN, a BUILD_BUG() and
comment are added for now, and this issue will need to be fixed properly
in a follow-up patch.

Commit 93ae6b01bafee8fa also introduced an unintended ptrace ABI change.
Unconditionally clearing TIF_SVE regardless of whether the state is
foreign discards saved SVE state created by ptrace after syscall entry.
Avoid this by only clearing TIF_SVE when the FPSIMD/SVE state is not
foreign. When the state is foreign, KVM hyp code does not need to save
any host state, and so this will not affect KVM.

There appear to be further issues with unintentional SVE state
discarding, largely impacting ptrace and signal handling, which will
need to be addressed in separate patches.

Reported-by: Eric Auger <eauger at redhat.com>
Reported-by: Wilco Dijkstra <wilco.dijkstra at arm.com>
Cc: stable at vger.kernel.org
Cc: Catalin Marinas <catalin.marinas at arm.com>
Cc: Florian Weimer <fweimer at redhat.com>
Cc: Jeremy Linton <jeremy.linton at arm.com>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Mark Brown <broonie at kernel.org>
Cc: Oliver Upton <oliver.upton at linux.dev>
Cc: Paolo Bonzini <pbonzini at redhat.com>
Cc: Will Deacon <will at kernel.org>
Signed-off-by: Mark Rutland <mark.rutland at arm.com>
---
 arch/arm64/kernel/fpsimd.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

I believe there are some other issues in this area, but I'm sending this
out on its own because I beleive the other issues are more complex while
this is self-contained, and people are actively hitting this case in
production.

I intend to follow-up with fixes for the other cases I mention in the
commit message, and for the SME case with the BUILD_BUG_ON().

Mark.

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 8c4c1a2186cc5..e4053a90ed240 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1711,8 +1711,24 @@ void fpsimd_kvm_prepare(void)
 	 */
 	get_cpu_fpsimd_context();
 
-	if (test_and_clear_thread_flag(TIF_SVE)) {
-		sve_to_fpsimd(current);
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE) &&
+	    test_and_clear_thread_flag(TIF_SVE)) {
+		sve_user_disable();
+
+		/*
+		 * The KVM hyp code doesn't set fp_type when saving the host's
+		 * FPSIMD state. Set fp_type here in case the hyp code saves
+		 * the host state.
+		 *
+		 * If hyp code does not save the host state, then the host
+		 * state remains live on the CPU and saved fp_type is
+		 * irrelevant until it is overwritten by a later call to
+		 * fpsimd_save_user_state().
+		 *
+		 * This is *NOT* sufficient when CONFIG_ARM64_SME=y, where
+		 * fp_type can be FP_STATE_SVE regardless of TIF_SVE.
+		 */
+		BUILD_BUG_ON(IS_ENABLED(CONFIG_ARM64_SME));
 		current->thread.fp_type = FP_STATE_FPSIMD;
 	}
 
-- 
2.30.2




More information about the linux-arm-kernel mailing list