PATCH/RFC: [kdump] fix APIC shutdown sequence

Eric W. Biederman ebiederm at xmission.com
Wed Aug 8 17:25:53 EDT 2007


Martin Wilck <martin.wilck at fujitsu-siemens.com> writes:

> Vivek Goyal wrote:
>
>> Got this oops while testing your patch when I did
>> "echo c > /proc/sysrq-trigger"
>
> That's bad :-(
>
> ...
>> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
>>  [<0000000000000000>]
>>  <IRQ>  [<ffffffff8025afd0>] handle_edge_irq+0x5c/0x127
>>  [<ffffffff8020e1ae>] do_IRQ+0xf1/0x15f
>>  [<ffffffff8020c191>] ret_from_intr+0x0/0xa
>>  <EOI>  <NMI>  [<ffffffff80351ae3>] __delay+0x6/0x10
>>  [<ffffffff8021e328>] crash_nmi_callback+0x4b/0x77
>
> Obviously, the oops occurs after the IRQs are re-enabled in
> crash_nmi_callback().
> It appears that desc->chip->mask (in mask_ack_irq(), kernel/irq/chip.c:462) was
> NULL.
> I have no idea how my patch could cause that.
>
> This proves that re-enabling IRQs in this situation is too dangerous :-(.
> I really did hundreds of dumps in different ways (echo c>/proc/sysrq-trigger,
> forcing Oops and panic(), actually hitting Alt-Sysrq) and this never happened to
> me.

This reminds me. 

There is currently an issue that no one has finished converting
the lapic to the new generic irq way, which can have this effect.
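
To illustrate the failure mode, here is a rough userspace sketch, not
the actual code in kernel/irq/chip.c.  The model_* names are made up
for this example.  It assumes a chip whose callbacks are only
partially filled in, as a half-converted chip might be, and shows that
calling through an unset ->mask pointer is a jump to address 0, which
would match the RIP in the oops.  The check below only makes the
missing callback visible in the sketch; it is not a proposed fix.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct irq_chip. */
struct model_irq_chip {
	void (*mask)(unsigned int irq);
	void (*ack)(unsigned int irq);
};

/* Hypothetical stand-in for struct irq_desc: only the chip pointer. */
struct model_irq_desc {
	struct model_irq_chip *chip;
};

/* Counters so a caller can observe which callbacks actually ran. */
static unsigned int masked_count;
static unsigned int acked_count;

static void model_mask(unsigned int irq) { (void)irq; masked_count++; }
static void model_ack(unsigned int irq)  { (void)irq; acked_count++; }

/*
 * Mask and acknowledge an interrupt through the chip callbacks.
 * Returns 0 on success, or -1 if a callback is missing.  Without the
 * check, a NULL ->mask would be called directly, i.e. a jump to 0.
 */
static int model_mask_ack_irq(struct model_irq_desc *desc, unsigned int irq)
{
	if (!desc->chip || !desc->chip->mask || !desc->chip->ack)
		return -1;

	desc->chip->mask(irq);
	desc->chip->ack(irq);
	return 0;
}
```

With a chip that sets both callbacks the call returns 0; with ->mask
left NULL it returns -1 where the unguarded version would fault.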

Eric


