panic kexec broken on ARM64?

David Woodhouse dwmw2 at
Thu Aug 2 08:49:54 PDT 2018

On Thu, 2018-07-05 at 11:19 +0100, Marc Zyngier wrote:
> >> The criteria is "this irqchip requires a reset to be safely used in the
> >> secondary kernel". This is a judgement call from the person writing the
> >> driver.
> > This doesn't tell me anything more than "do it if you need it."
> > So let me ask you in other words.
> > Does gic driver need to provide a reset function?
> > Whether yes or no, why do you think so?
> Because I know the architecture and I can assess that it needs it. Case
> in point: The RDs have memory tables. kexec without disabling LPIs, and
> you end-up with memory corruption.
> Sorry, but there is no magic bullet. You have to understand what you're
> doing.

Remember, kexec and kdump are subtly different things.

In the case of an orderly kexec, sure, you can go walking chains of
interrupt controllers (and other devices) and nicely quiescing them.

In the kdump case it's different. You really want as few instructions
as possible between realising you're going to panic, and entering the
kdump kernel. You NMI¹ all the other cores to dump their state, and
just GTFO.

In the kdump case you also aren't *reusing* the memory, which means
that existing memory tables which are being accessed by hardware
shouldn't be an issue. You can let the second kernel reset it all from
a controlled and not-already-panicking environment.
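
To make the distinction concrete, here is a toy model of the two paths. It
is only a sketch and not kernel code: Device, Machine, the page numbers and
the "pages at risk" check are all made up for illustration, and it assumes
the kdump kernel runs entirely out of a region reserved at boot.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device that keeps writing to a table in memory until reset."""
    name: str
    table_page: int          # page the hardware still writes to
    active: bool = True

    def quiesce(self) -> None:
        self.active = False


@dataclass
class Machine:
    """Hypothetical machine; crash_region is the range reserved for kdump."""
    devices: list
    crash_region: range
    steps: list = field(default_factory=list)

    def pages_at_risk(self, new_kernel_pages) -> set:
        """Pages the new kernel will use that live hardware may still write to."""
        return {d.table_page for d in self.devices
                if d.active and d.table_page in new_kernel_pages}

    def orderly_kexec(self) -> set:
        """Orderly path: quiesce every device, then reuse all of memory."""
        for dev in self.devices:
            dev.quiesce()
            self.steps.append(f"quiesce {dev.name}")
        self.steps.append("jump to new kernel")
        return self.pages_at_risk(range(0, 1024))

    def crash_kexec(self) -> set:
        """Panic path: stop the other CPUs and jump, touching nothing else."""
        self.steps.append("stop other CPUs")
        self.steps.append("jump to kdump kernel")
        return self.pages_at_risk(self.crash_region)


def example_machine() -> Machine:
    # Two devices with tables in ordinary memory, outside the reserved region.
    return Machine([Device("its", 10), Device("nic", 20)], range(512, 768))
```

In this model the panic path takes two steps and leaves every device running,
yet nothing the kdump kernel uses is at risk, because the tables live outside
the reserved region. The orderly path is only safe because it quiesces first:
reusing all of memory with the devices still active would put both table
pages at risk.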


¹ Oops no NMI. Doh.
