SError Interrupt on CPU0, code 0xbf000000 makes kernel panic

Joakim Tjernlund Joakim.Tjernlund at infinera.com
Thu Mar 24 08:42:49 PDT 2022


On Thu, 2022-03-24 at 15:25 +0000, Marc Zyngier wrote:
> On Thu, 24 Mar 2022 15:11:42 +0000,
> Joakim Tjernlund <Joakim.Tjernlund at infinera.com> wrote:
> > 
> > On Thu, 2022-03-24 at 15:05 +0000, Robin Murphy wrote:
> 
> > > Well, except when it is... try that on a Qualcomm SoC where the EL2
> > > firmware will trap you and reset the system before you even know you've 
> > > done anything wrong. If you know enough to know that an error triggered 
> > > by accessing some address is truly benign, you know enough to avoid 
> > > making that access in the first place.
> > 
> > of course the error will be dealt with but why make bug finding
> > harder than it has to be?
> 
> Maybe that was not clear enough from our earlier replies. Let me try
> again.
> 
> There is *nothing* more the kernel can do. We don't even know what
> caused the access (read, write, earthquake or foreign power invasion).
> 
> By the time we get the SError interrupt, we could well be running
> something altogether different because all of that is totally
> asynchronous *by nature*. You're just lucky that you get the response
> quickly enough that the kernel is still running the offending
> userspace.

I worked on ppc earlier, and there I am used to getting an exception (MachineCheck) with the PC and data address
for similar cases; the kernel can usually pass that on to user space as a SIGBUS and move along.
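
To illustrate the model I mean, here is a minimal userspace sketch. It assumes Linux and stands in for the hardware case with a synchronous fault the kernel already reports this way: a read from a file mapping past end-of-file. The names map_empty_file(), read_faults() and fault_addr are hypothetical, made up for this example. It only works because the fault is synchronous and carries an address in si_addr; it says nothing about what can be done for an asynchronous SError.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;   /* where to resume after a fault */
void *volatile fault_addr;         /* address reported by the kernel */

/* SIGBUS handler: record the faulting address and jump back out. */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Map len bytes of an empty temporary file. Touching the mapping
 * raises SIGBUS because there is no file data behind it. */
void *map_empty_file(size_t len)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return MAP_FAILED;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(f), 0);
    fclose(f);                     /* the mapping outlives the descriptor */
    return p;
}

/* Read one byte from p.
 * Returns 1 if the read raised SIGBUS, 0 if it succeeded,
 * -1 if the handler could not be installed. */
int read_faults(const volatile char *p)
{
    struct sigaction sa, old;
    volatile int faulted = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;      /* needed to receive siginfo_t */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    fault_addr = NULL;
    if (sigsetjmp(recover_point, 1) == 0) {
        char c = *p;               /* may fault synchronously */
        (void)c;
    } else {
        faulted = 1;               /* arrived here via siglongjmp */
    }

    sigaction(SIGBUS, &old, NULL); /* restore the previous handler */
    return faulted;
}
```

Calling read_faults() on a pointer returned by map_empty_file() should return 1 with fault_addr set to that pointer, while an ordinary valid pointer returns 0 and the program carries on.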

It seems ARM works very differently and pulls the plug directly; I just find that odd.

 Jocke

