cacheflush completely broken, suspecting PAN+LPAE
Michał Pecio
michal.pecio at gmail.com
Mon Nov 11 14:38:17 PST 2024
Hi,
I installed v6.11.5 on Tegra K1 (Cortex-A15) with tegra_defconfig +
CONFIG_ARM_LPAE + a few drivers + minor patches for driver issues.
gdb segfaults on startup, strace shows this:
openat(AT_FDCWD, "/usr/lib/guile/3.0/ccache/ice-9/psyntax-pp.go", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 11
_llseek(11, 0, [477309], SEEK_END) = 0
mmap2(NULL, 477309, PROT_READ, MAP_PRIVATE, 11, 0) = 0xb378a000
close(11) = 0
mprotect(0xb37da000, 43512, PROT_READ|PROT_WRITE) = 0
mmap2(NULL, 262144, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb374a000
cacheflush(0xb374a000, 0xb374b000, 0) = -1 EFAULT (Bad address)
cacheflush(0xb374a000, 0xb374b000, 0) = -1 EFAULT (Bad address)
cacheflush(0xb374a000, 0xb374b000, 0) = -1 EFAULT (Bad address)
futex(0xb6f0bb14, FUTEX_WAKE_PRIVATE, 2147483647) = 0
cacheflush(0xb374a000, 0xb374b000, 0) = -1 EFAULT (Bad address)
cacheflush(0xb374a000, 0xb374b000, 0) = -1 EFAULT (Bad address)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x4} ---
guile is apparently a scripting language with a JIT compiler. Disabling
the JIT resolves the crash, so cacheflush is a big suspect at this point.
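A minimal sketch of a standalone check that takes guile and gdb out of the picture, assuming the failure depends only on the mapping type and the syscall seen in the strace output. The function name probe_cacheflush is hypothetical, and writing to the page before flushing is an assumption about what the JIT does. On non-ARM builds the ARM-private syscall does not exist, so the sketch falls back to the compiler builtin, which cannot report an error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Map an anonymous RWX region like the one in the strace output and
 * flush its first page. Returns 0 on success, -1 with errno set on
 * failure (either from mmap or from the cacheflush syscall).
 */
int probe_cacheflush(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Assumption: a JIT writes code before flushing, so touch the page. */
    buf[0] = 0;

    int ret = 0;
#ifdef __ARM_NR_cacheflush
    /* 32-bit ARM: same call as in the trace, cacheflush(start, end, 0). */
    ret = (int)syscall(__ARM_NR_cacheflush, buf, buf + page, 0);
#else
    /* Other architectures: portable builtin, no error reporting. */
    __builtin___clear_cache(buf, buf + page);
#endif

    /* Preserve the syscall's errno across munmap(). */
    int saved = errno;
    munmap(buf, len);
    errno = saved;
    return ret;
}
```

Calling probe_cacheflush(262144) from a small main() and printing errno on failure would mirror the trace above. If the assumption holds, the expectation is -1 with EFAULT on the affected kernel and 0 otherwise, but this has not been verified with this exact program.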
No apparent valid reason for those failures and no recent changes to
cacheflush handling that I see, so I searched for commits touching LPAE
(perhaps the most uncommon part of my system) and I quickly found the
ARM_PAN feature, which claims to monkey with page tables. Hmm...
Disabled ARM_PAN, cacheflush returns 0 now and gdb crashes no more.
So I guess there is a problem with this feature, perhaps a missing
"permit user access" somewhere?
Thanks,
Michal
More information about the linux-arm-kernel
mailing list