SError Interrupt on CPU0, code 0xbf000000 makes kernel panic

Robin Murphy robin.murphy at arm.com
Thu Mar 24 06:16:06 PDT 2022


On 2022-03-24 12:10, Joakim Tjernlund wrote:
> We have a custom SoC (Cortex-A53 CPU) that, when an app accesses a non-existent address range, reports:
> # devmem 0x20000000 w 0x1000   # this opens /dev/mem and writes
>   
> [   37.570886] SError Interrupt on CPU0, code 0xbf000000 -- SError
> [   37.571974] CPU: 0 PID: 72 Comm: devmem Not tainted 5.15.26-g18447c6fff6f-dirty #26
> [   37.573150] Hardware name: infinera,xr (DT)
> [   37.573599] pstate: 60000010 (nZCv q A32 LE aif -DIT -SSBS)
> [   37.574705] pc : 000000000098775c
> [   37.575063] lr : 0000000000986918
> [   37.575392] sp : 00000000ffd140a8
> [   37.575725] x12: 0000000000a36c10
> [   37.576443] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000020
> [   37.577872] x8 : 00000000ffd141c0 x7 : 00000000ffd14104 x6 : 0000000000986c9c
> [   37.579278] x5 : 000000000000001f x4 : 0000000000000004 x3 : 0000000000a37020
> [   37.580635] x2 : 0000000000000003 x1 : 0000000000001000 x0 : 0000000000000000
> [   37.582164] Kernel panic - not syncing: Asynchronous SError Interrupt
> [   37.582685] Kernel Offset: disabled
> [   37.582932] CPU features: 0x00001001,20000842
> [   37.583509] Memory Limit: none
> [   37.630058] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
> 
> and the kernel panics. This is a surprise, as I expected the app to just be killed by a SIGBUS.
> Is this expected?
> I see that the kernel looks for the RAS extension, but we don't have that.
> 
> Can anything be done not to panic the kernel for such accesses?

No. The error comes back to the CPU in an unattributable manner, so all 
it knows is that *something*, at some point in the past, went 
catastrophically wrong. Saying "this is fine..." and carrying on 
regardless isn't really viable. IIRC the RAS extension places 
constraints on the delivery of async SError such that it's slightly more 
possible to do something with, but without that all bets are off.
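
For reference, the "code" in the log above reads as an ESR_ELx syndrome
value, and it carries very little detail. A minimal Python sketch of
decoding it follows; the field positions are taken from the
architectural ESR_ELx layout, while the helper name decode_serror_code
and the dictionary keys are made up for illustration:

```python
# Field layout of ESR_ELx (Exception Syndrome Register).
ESR_EC_SHIFT = 26              # EC  = bits [31:26], exception class
ESR_EC_MASK = 0x3F
ESR_IL = 1 << 25               # IL  = bit 25, instruction length
ESR_ISS_MASK = (1 << 25) - 1   # ISS = bits [24:0]
ESR_EC_SERROR = 0x2F           # exception class for SError interrupt
ESR_SERROR_IDS = 1 << 24       # ISS[24]: syndrome is IMPLEMENTATION DEFINED


def decode_serror_code(code):
    """Split the value printed by the kernel into its ESR_ELx fields."""
    ec = (code >> ESR_EC_SHIFT) & ESR_EC_MASK
    iss = code & ESR_ISS_MASK
    return {
        "ec": ec,
        "is_serror": ec == ESR_EC_SERROR,
        "il": bool(code & ESR_IL),
        "ids": bool(iss & ESR_SERROR_IDS),
        # Remaining syndrome bits below IDS; their meaning is
        # implementation defined when IDS is set.
        "iss_low": iss & (ESR_SERROR_IDS - 1),
    }


info = decode_serror_code(0xBF000000)
print(info)
```

For 0xbf000000 this gives EC = 0x2f (SError), IDS set, and all lower
syndrome bits zero, i.e. nothing that would identify the faulting
access.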

> Can one build some sort of blacklist of address ranges that the MMU will block?

Sure, just configure the kernel with CONFIG_DEVMEM=n and it should never 
access anything invalid.
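
One way to confirm that a given build really has it disabled is to look
at the kernel configuration. A rough Python sketch, assuming the config
text is available as a string (for instance the build's .config); the
function name devmem_enabled is hypothetical:

```python
def devmem_enabled(config_text):
    """Return True if the config text has CONFIG_DEVMEM=y.

    Kconfig writes a disabled bool option as
    '# CONFIG_DEVMEM is not set'; an option that is absent
    altogether is treated as disabled here as well.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == "CONFIG_DEVMEM=y":
            return True
        if line == "# CONFIG_DEVMEM is not set":
            return False
    return False


# Illustrative config fragments, not taken from a real build.
with_devmem = "CONFIG_ARM64=y\nCONFIG_DEVMEM=y\n"
without_devmem = "CONFIG_ARM64=y\n# CONFIG_DEVMEM is not set\n"

print(devmem_enabled(with_devmem))
print(devmem_enabled(without_devmem))
```

On a running target built with CONFIG_IKCONFIG_PROC, the same text
could be read with gzip.open("/proc/config.gz", "rt").read().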

I'm not even entirely joking there - even for address ranges that the 
kernel *does* know about, you can still SError or deadlock by poking at 
something that's currently clock-gated or powered off, or lose coherency 
and cause corruption by accessing memory with the wrong attributes; at 
worst writing the wrong thing to the wrong place may even physically 
damage the hardware.

Robin.


