[tip:x86/debug] x86/kdump: No need to disable ioapic/lapic in crash path

tip-bot for Don Zickus dzickus at redhat.com
Sat Feb 11 18:09:47 EST 2012


Commit-ID:  d9bc9be89629445758670220787683e37c93f6c1
Gitweb:     http://git.kernel.org/tip/d9bc9be89629445758670220787683e37c93f6c1
Author:     Don Zickus <dzickus at redhat.com>
AuthorDate: Thu, 9 Feb 2012 16:53:41 -0500
Committer:  Ingo Molnar <mingo at elte.hu>
CommitDate: Sat, 11 Feb 2012 15:38:53 +0100

x86/kdump: No need to disable ioapic/lapic in crash path

A customer of ours noticed that when their machine crashed,
kdump hung instead of capturing a dump.  Using their firmware
dumping solution they grabbed a vmcore and decoded the stacks on
the cpus.  What they found appeared to be a rare deadlock on
the ioapic_lock.

 CPU4:
 machine_crash_shutdown
 -> machine_ops.crash_shutdown
    -> native_machine_crash_shutdown
       -> kdump_nmi_shootdown_cpus ------> Send NMI to other CPUs
       -> disable_IO_APIC
          -> clear_IO_APIC
             -> clear_IO_APIC_pin
                -> ioapic_read_entry
                   -> spin_lock_irqsave(&ioapic_lock, flags)
                   ---Infinite loop here---

 CPU0:
 do_IRQ
 -> handle_irq
    -> handle_edge_irq
        -> ack_apic_edge
           -> move_native_irq
               -> mask_IO_APIC_irq
                  -> mask_IO_APIC_irq_desc
                     -> spin_lock_irqsave(&ioapic_lock, flags)
                     ---Receive NMI here after getting spinlock---
                        -> nmi
                           -> do_nmi
                              -> crash_nmi_callback
                              ---Infinite loop here---

The problem is that although kdump tries to shut down minimal
hardware, it still needs to disable the IO APIC.  This requires
spinlocks which may be held by another cpu.  That other cpu is
being held indefinitely in an NMI context by kdump in order to
serialize the crashing path.  Instant deadlock.

Eric brought up a point that because the boot code was
restructured we may not need to disable the ioapic any more in
the crash path.  The original concern that led to the
development of disable_IO_APIC was that the jiffies calibration
on boot up relied on the PIT timer for reference.  Access to the
PIT required 8259 interrupts to be working.  This wouldn't work
if the ioapic needed to be configured.  So on the panic path,
the ioapic was reconfigured to use virtual wire mode to allow
the 8259 to pass through.

Those concerns don't hold true now, thanks to the jiffies
calibration code not needing the PIT.  As a result, we can
remove this call and simplify the locking needed in the panic
path.

The same work allowed us to remove the need to disable the local
apic on shutdown too.  This should allow us to jump to the
second kernel a little faster.

I tested kdump on an Ivy Bridge platform, a Pentium 4 and an old
Athlon that did not have an ioapic.  All three were successful.

I also tested with lkdtm, using jprobes to panic the system on
entry to do_IRQ.  The idea was to see how the system reacted
with an interrupt pending in the second kernel.  My core2 quad
successfully kdump'd 3 times in a row with no issues.

v2: removed the disable lapic code too

Signed-off-by: Don Zickus <dzickus at redhat.com>
Acked-by: Eric W. Biederman <ebiederm at xmission.com>
Cc: kexec-list <kexec at lists.infradead.org>
Cc: Vivek Goyal <vgoyal at redhat.com>
Cc: Linus Torvalds <torvalds at linux-foundation.org>
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Yinghai Lu <yinghai at kernel.org>
Signed-off-by: Ingo Molnar <mingo at elte.hu>
---
 arch/x86/kernel/crash.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 13ad899..571f253 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -54,16 +54,12 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
 	 */
 	cpu_emergency_vmxoff();
 	cpu_emergency_svm_disable();
-
-	disable_local_APIC();
 }
 
 static void kdump_nmi_shootdown_cpus(void)
 {
 	in_crash_kexec = 1;
 	nmi_shootdown_cpus(kdump_nmi_callback);
-
-	disable_local_APIC();
 }
 
 #else
@@ -95,10 +91,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 	cpu_emergency_vmxoff();
 	cpu_emergency_svm_disable();
 
-	lapic_shutdown();
-#if defined(CONFIG_X86_IO_APIC)
-	disable_IO_APIC();
-#endif
 #ifdef CONFIG_HPET_TIMER
 	hpet_disable();
 #endif


