PATCH/RFC: [kdump] fix APIC shutdown sequence

Martin Wilck martin.wilck at fujitsu-siemens.com
Wed Aug 8 14:07:13 EDT 2007


Vivek Goyal wrote:

> Got this oops while testing your patch when I did
> "echo c > /proc/sysrq-trigger"

That's bad :-(

...
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
 [<0000000000000000>]
>  <IRQ>  [<ffffffff8025afd0>] handle_edge_irq+0x5c/0x127
>  [<ffffffff8020e1ae>] do_IRQ+0xf1/0x15f
>  [<ffffffff8020c191>] ret_from_intr+0x0/0xa
>  <EOI>  <NMI>  [<ffffffff80351ae3>] __delay+0x6/0x10
>  [<ffffffff8021e328>] crash_nmi_callback+0x4b/0x77

Obviously, it the oops occurs after the IRQs are re-enabled in crash_nmi_callback().
It appears that desc->chip->mask (in mask_ack_irq(), kernel/irq/chip.c:462) was NULL.
I have no idea how my patch could cause that.

This proves that re-enabling IRQs in this situation is too dangerous :-(.
I really did hundreds of dumps in different ways (echo c>/proc/sysrq-trigger,
forcing Oops and panic(), actually hititng Alt-Sysrq) and this never happened to
me.

Martin

-- 
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DE6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel:			++49 5251 8 15113
Fax:			++49 5251 8 20409
Email:			mailto:martin.wilck at fujitsu-siemens.com
Internet:		http://www.fujitsu-siemens.com
Company Details:	http://www.fujitsu-siemens.com/imprint.html



More information about the kexec mailing list