[PATCH 1/2] boot: ignore early NMIs

Fernando Luis Vázquez Cao fernando at oss.ntt.co.jp
Mon Mar 12 01:43:42 EDT 2012


On 03/10/2012 05:52 AM, H. Peter Anvin wrote:

> Is there a reason to not just simply block these NMIs during the kexec
> sequence?
Ok, some background:

In the reboot path to the kdump kernel we disable local interrupts
and the APICs in native_machine_crash_shutdown() and reset the IDT
in machine_kexec(), which leaves an invalid IDT installed.

However, disabling the I/O APIC involves taking a lock, which in
the event of a crash is racy and can lead to a deadlock. To
solve this issue Don wrote a patch that left the I/O APICs and
the LAPIC of the crashing CPU untouched in the kdump reboot path,
but this seemed to cause mysterious reboots on some systems.
It turned out that an NMI coming from the perf based hardlockup
detector was causing the system to triple fault. If an NMI happens
to arrive in the window between the invalidation of the IDT in
machine_kexec() and the configuration of the final IDT we will be
in big trouble. In particular, the system will either triple fault
or halt, depending on whether the NMI arrived before or after
installing the early IDT.
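
To make the window concrete, here is a toy user-space model of the
phases described above. It is only a sketch: the enum and function
names are made up for illustration and are not kernel code, and the
"handled" result for the first and last phases is an assumption (the
text above only covers the window in between).

```c
/* Hypothetical phases of the kdump reboot path (names are illustrative). */
enum idt_phase {
    IDT_CRASHED_KERNEL, /* crashing kernel's IDT still loaded (assumed OK) */
    IDT_INVALIDATED,    /* machine_kexec() has reset the IDT */
    IDT_EARLY,          /* kdump kernel has installed its early IDT */
    IDT_FINAL           /* kdump kernel has configured its final IDT */
};

enum nmi_outcome { NMI_HANDLED, NMI_TRIPLE_FAULT, NMI_HALT };

/* What happens to an NMI that arrives while the given IDT is in place. */
enum nmi_outcome nmi_outcome_in_phase(enum idt_phase phase)
{
    switch (phase) {
    case IDT_INVALIDATED:
        return NMI_TRIPLE_FAULT; /* no usable gate for the NMI */
    case IDT_EARLY:
        return NMI_HALT;         /* early handler does not cope with NMIs */
    default:
        return NMI_HANDLED;      /* assumption: a full IDT handles it */
    }
}

/* Human-readable name for an outcome, for logging or printing. */
const char *nmi_outcome_name(enum nmi_outcome outcome)
{
    switch (outcome) {
    case NMI_TRIPLE_FAULT: return "triple fault";
    case NMI_HALT:         return "halt";
    default:               return "handled";
    }
}
```

Calling nmi_outcome_in_phase() for each phase in order shows the two
bad states sitting between two good ones, which is exactly the window
we have to close.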

To tackle this issue we can either stop the hardlockup detector
or disable the LAPIC (the NMIs needed by x86's hardlockup detector
are generated using performance counters in the LAPIC), leaving
the I/O APICs untouched. The second is simpler and I think it
is the approach Don took to fix this issue in RHEL kernels.

Unfortunately, this is not enough: we are still exposed to external
NMIs not routed through the LAPIC. In other words, we have to make
sure that we always have an IDT that is able to handle NMIs without
seemingly random reboots and lockups. To achieve this goal we need
to fix machine_kexec() and the early IDT handlers. The current patch
set takes care of the latter.
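
As a rough illustration of what "ignore early NMIs" means for the
early handlers, the decision can be modelled as below. This is a
user-space sketch and not the patch itself: the enum and function
names are hypothetical, and what happens to other vectors is left
unmodelled. The only hardware fact relied on is that x86 delivers
NMIs on vector 2.

```c
/* x86 delivers NMIs on interrupt vector 2. */
#define NMI_VECTOR 2

enum early_action {
    EARLY_IGNORE,  /* return from the handler and keep booting */
    EARLY_DEFAULT  /* whatever the early handler already does; not modelled */
};

/* Decide how a (hypothetical) early handler treats the given vector. */
enum early_action early_vector_action(int vector)
{
    return vector == NMI_VECTOR ? EARLY_IGNORE : EARLY_DEFAULT;
}
```

The point is only that vector 2 gets special treatment during early
boot; every other vector keeps its existing behaviour.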

- Fernando



