Userspace programs segfaulting since CONFIG_ARM_PAN was introduced for LPAE
Urja
urja at urja.dev
Fri Jul 19 04:09:23 PDT 2024
Hi,
Starting KDE Plasma (startplasma-wayland) on my ASUS C201 (RK3288) segfaults with any kernel since "ARM: 9358/2: Implement PAN for LPAE by TTBR0 page table walks disablement"* when CONFIG_ARM_PAN=y is set (that commit enables it by default for an LPAE kernel).
I found the issue while testing Linux 6.10 (so if you're reading this in the future, this was broken from 6.9 to 6.10, at least).
Trying to analyze the coredump from that segfault with gdb running on the same kernel causes gdb to segfault as well. So I assume this affects many userspace programs in this configuration (though not the very basics: I can ssh in, use the shell, tar, etc.).
The backtrace from the core file (rebooted to a non-PAN kernel to get it, of course) is ... a bit confusing to me:
Core was generated by `/usr/bin/startplasma-wayland'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=11,
no_tid=<optimized out>) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0xb04cb180 (LWP 470))]
(gdb) bt
#0 __pthread_kill_implementation
(threadid=<optimized out>, signo=11, no_tid=<optimized out>)
at pthread_kill.c:44
#1 0xb5ad7db0 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#2 0xb40adf94 in KCrash::defaultCrashHandler (sig=11)
at /usr/src/debug/kcrash/kcrash-5.115.0/src/kcrash.cpp:631
#3 <signal handler called> () at ../sysdeps/unix/sysv/linux/arm/sigrestorer.S
#4 0x00000000 in ??? ()
#5 0xb3d996c0 in jit_machine_stack_exec
(arguments=<optimized out>, executable_func=<optimized out>)
at src/pcre2_jit_match.c:63
#6 0x6b6af900 in ??? ()
(Worth noting, though not the cause: KDE/KCrash catches the signal and re-raises it after failing to start DrKonqi.)
To me that looks like something in jit_machine_stack_exec (from pcre2) jumps to a null pointer. Does the problem have something to do with it running JIT-compiled code?
As a workaround, I disabled CONFIG_ARM_PAN for my kernel (as it effectively was before, too), but I assume a normal, well-behaving userspace should not be affected by it, so that's why I'm reporting this (it also took a 15-stage bisection to find the cause, so ..).
Sorry about being so late to this party (I rarely test mainline kernels), and thanks for reading.
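For reference, the workaround amounts to roughly the following in the kernel .config. This is only a sketch of the two relevant options, assuming an otherwise unchanged LPAE configuration; everything else in the config stays as it was:

```
CONFIG_ARM_LPAE=y
# CONFIG_ARM_PAN is not set
```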
--
Urja Rannikko
(*Actually, to boot into userspace I need the CC_OPTIMIZE_FOR_SIZE=y fix that arrived 2 commits later, but effectively... it's obvious that this is the real "source".)