[PATCH] um: work around sched_yield not yielding in time-travel mode
Benjamin Berg
benjamin at sipsolutions.net
Fri Mar 14 06:08:15 PDT 2025
From: Benjamin Berg <benjamin.berg at intel.com>
A sched_yield call from userspace may not actually cause rescheduling
in time-travel mode, because no time has passed. In the case observed,
the caller appears to be a badly implemented userspace spinlock in ASAN.
Unfortunately, with time-travel this causes an extreme slowdown or even
a deadlock, depending on the kernel configuration
(CONFIG_UML_MAX_USERSPACE_ITERATIONS).
Work around it by accounting time to the process whenever it executes a
sched_yield syscall.
Signed-off-by: Benjamin Berg <benjamin.berg at intel.com>
---
I suspect it is this code in ASAN that uses sched_yield
https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp
though there are also some other places that use sched_yield.
I doubt that code is reasonable. At the same time, I am not sure that
sched_yield is behaving as advertised either, as it evidently does not
necessarily relinquish the CPU.
---
arch/um/include/linux/time-internal.h | 2 ++
arch/um/kernel/skas/syscall.c | 11 +++++++++++
2 files changed, 13 insertions(+)
diff --git a/arch/um/include/linux/time-internal.h b/arch/um/include/linux/time-internal.h
index b22226634ff6..138908b999d7 100644
--- a/arch/um/include/linux/time-internal.h
+++ b/arch/um/include/linux/time-internal.h
@@ -83,6 +83,8 @@ extern void time_travel_not_configured(void);
#define time_travel_del_event(...) time_travel_not_configured()
#endif /* CONFIG_UML_TIME_TRAVEL_SUPPORT */
+extern unsigned long tt_extra_sched_jiffies;
+
/*
* Without CONFIG_UML_TIME_TRAVEL_SUPPORT this is a linker error if used,
* which is intentional since we really shouldn't link it in that case.
diff --git a/arch/um/kernel/skas/syscall.c b/arch/um/kernel/skas/syscall.c
index b09e85279d2b..a5beaea2967e 100644
--- a/arch/um/kernel/skas/syscall.c
+++ b/arch/um/kernel/skas/syscall.c
@@ -31,6 +31,17 @@ void handle_syscall(struct uml_pt_regs *r)
goto out;
syscall = UPT_SYSCALL_NR(r);
+
+ /*
+ * If no time passes, then sched_yield may not actually yield, causing
+ * broken spinlock implementations in userspace (ASAN) to hang for long
+ * periods of time.
+ */
+ if ((time_travel_mode == TT_MODE_INFCPU ||
+ time_travel_mode == TT_MODE_EXTERNAL) &&
+ syscall == __NR_sched_yield)
+ tt_extra_sched_jiffies += 1;
+
if (syscall >= 0 && syscall < __NR_syscalls) {
unsigned long ret = EXECUTE_SYSCALL(syscall, regs);
--
2.48.1