PATCH/RFC: [kdump] fix APIC shutdown sequence

Wed Aug 8 07:38:34 EDT 2007

On Mon, Aug 06, 2007 at 05:08:05PM +0200, Martin Wilck wrote:
> PATCH/RFC: [kdump] fix APIC shutdown sequence
> 
> This patch fixes a problem that we have encountered
> with kdump under high I/O load on some machines.
> The machines showing the errors have an Intel ICH7
> chipset with a 6702PXH PCI Express-to-PCI Bridge
> (8086:032c) containing an IO-APIC.
> 
> The bug symptom is that certain controllers connected
> to the 6702PXH bridge wouldn't receive any IRQs in the
> kdump kernel. In the error case (which is about 20% of
> all cases) the IRR bit of the IO-APIC pin for that
> controller is always set after the start of the kdump
> kernel, indicating an IRQ in progress. We haven't found
> a way to recover from this situation once it has
> occurred, except for a system reset.
> 
> The error is caused by IRQs arriving while the APIC
> subsystem is deactivated in machine_crash_shutdown().
> 
> Apparently, the IO-APIC gets stuck if it sends an IRQ
> message to a Local APIC and never receives an EOI for that
> message. This can have several possible reasons:
> 
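
For readers following the description above: the "IRQ in progress" state can be read from the low 32 bits of the pin's IO-APIC redirection table entry. The following is only a minimal sketch, assuming the 82093AA-style entry layout (Remote IRR in bit 14, trigger mode in bit 15, mask in bit 16). The macro and function names are hypothetical and are not taken from the patch or from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the low 32 bits of an IO-APIC redirection table entry,
 * assuming the 82093AA-style layout. Names are illustrative only. */
#define RTE_REMOTE_IRR     (1u << 14) /* set when a Local APIC accepts the IRQ,
                                         cleared by a matching EOI */
#define RTE_TRIGGER_LEVEL  (1u << 15) /* 1 = level-triggered, 0 = edge */
#define RTE_MASKED         (1u << 16) /* 1 = pin masked */

/* Hypothetical helper: true if a level-triggered pin is still waiting for
 * an EOI. Remote IRR is undefined for edge-triggered pins, so those are
 * reported as not in progress. */
static bool rte_irq_in_progress(uint32_t rte_low)
{
    return (rte_low & RTE_TRIGGER_LEVEL) && (rte_low & RTE_REMOTE_IRR);
}

/* Hypothetical helper: true if the pin is unmasked but stuck in progress,
 * which matches the symptom described (no further IRQs delivered). */
static bool rte_looks_stuck(uint32_t rte_low)
{
    return rte_irq_in_progress(rte_low) && !(rte_low & RTE_MASKED);
}
```

A pin for which rte_looks_stuck() stays true after the kdump kernel has set up the IO-APIC would correspond to the failing case described above.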

I got the following oops while testing your patch. I triggered the crash with
"echo c > /proc/sysrq-trigger":

[root@llm37 ~]# SysRq : Trigger a crashdump
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
 [<0000000000000000>]
PGD 22229b067 PUD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 2947, comm: klogd Not tainted 2.6.23-rc1-apic-issue #2
RIP: 0010:[<0000000000000000>]  [<0000000000000000>]
RSP: 0018:ffffffff807daf50  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffffff8073f480 RCX: ffffffff807ddea0
RDX: ffffffff807717c0 RSI: ffffffff8073f480 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff807ddd28 R11: 0000000000000150 R12: 0000000000000000
R13: ffffffff8073f4d0 R14: 000000000000000c R15: 0000000000000000
FS:  00002b204ea686f0(0000) GS:ffffffff8073c000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000223e9e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process klogd (pid: 2947, threadinfo ffff81021b7b8000, task ffff8102251a27b0)
Stack:  ffffffff8025afd0 0000000000000046 ffffffff807dddf8 0000000000000000
 0000000000000030 0000000000000000 ffffffff8020e1ae ffffffff807d15c8
 0000000000000000 ffffffff807dde20 0000000000000000 ffffffff807ddee8
Call Trace:
 <IRQ>  [<ffffffff8025afd0>] handle_edge_irq+0x5c/0x127
 [<ffffffff8020e1ae>] do_IRQ+0xf1/0x15f
 [<ffffffff8020c191>] ret_from_intr+0x0/0xa
 <EOI>  <NMI>  [<ffffffff80351ae3>] __delay+0x6/0x10
 [<ffffffff8021e328>] crash_nmi_callback+0x4b/0x77
 [<ffffffff80557075>] notifier_call_chain+0x29/0x4c
 [<ffffffff80249a11>] notify_die+0x2d/0x34
 [<ffffffff805555a3>] default_do_nmi+0x55/0x197
 [<ffffffff80555f0e>] do_nmi+0x2e/0x44
 [<ffffffff805553af>] nmi+0x7f/0x90
 [<ffffffff8022e00a>] find_busiest_group+0x20e/0x6b3
 <<EOE>>  [<ffffffff80552d13>] __sched_text_start+0x1cb/0x5ba
 [<ffffffff80282294>] do_sync_write+0xc9/0x10c
 [<ffffffff80235c32>] do_syslog+0x11d/0x3a9
 [<ffffffff80246a03>] autoremove_wake_function+0x0/0x2e
 [<ffffffff802731c6>] free_pages_and_swap_cache+0x73/0x8f
 [<ffffffff802bbeac>] kmsg_read+0x3a/0x44
 [<ffffffff802b5245>] proc_reg_read+0x7e/0x99
 [<ffffffff80282b08>] vfs_read+0xaa/0x132
 [<ffffffff80282ea4>] sys_read+0x45/0x6e
 [<ffffffff8020bc7e>] system_call+0x7e/0x83

Code:  Bad RIP value.
RIP  [<0000000000000000>]
 RSP <ffffffff807daf50>
CR2: 0000000000000000

Thanks
Vivek