[RFC][PATCH] Fix kexec abort due to IPI from panic().

Eric W. Biederman ebiederm at xmission.com
Mon Sep 27 12:59:28 EDT 2010


Seiji Aguchi <seiji.aguchi at hds.com> writes:

> Hi Eric,
>
> This is a patch which makes kmsg_dump() non-blocking.
> Please give me your comments and suggestions.
>
> I improved it as follows.
>
> (1) Improvement of dump_list_lock
>     (1-1) I changed dump_list to RCU for deleting dump_list_lock in kmsg_dump().
>     (1-2) I moved kmsg_dump(KMSG_DUMP_KEXEC) behind machine_crash_shutdown() 
>           for avoiding concurrent execution of dump_list functions.
>     (1-3) I also moved kmsg_dump(KMSG_DUMP_PANIC) behind smp_send_stop() for the 
>           same reason.
>  
> (2) Improvement of logbuf_lock
>     I added spinlock_init(&logbuf_lock) when executing kmsg_dump() in the kexec or
>     panic path to prevent deadlock.
>
> We can delete blocking kmsg_dump call in crash_kexec and panic path.
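
[Illustration of the idea above: a minimal userspace sketch, not the patch
itself. It assumes C11 atomics and pthreads in place of the kernel's RCU and
spinlock primitives. struct dumper, dumper_register(), dump_nonblocking() and
count_dump() are hypothetical names made up for this sketch. It also uses a
trylock on the log buffer lock instead of re-initializing the lock as the patch
describes, so a lock that is already held causes the dump to be skipped rather
than forced.]

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered dumper; not the kernel's type. */
struct dumper {
    void (*dump)(const char *buf, size_t len);
    struct dumper *_Atomic next;
};

static struct dumper *_Atomic dump_list;   /* head, published atomically */
static pthread_mutex_t register_lock = PTHREAD_MUTEX_INITIALIZER; /* writers only */
static pthread_mutex_t logbuf_lock = PTHREAD_MUTEX_INITIALIZER;
static char logbuf[64] = "example log contents";
static int dump_calls;

/* Example dumper callback: only counts how often it was invoked. */
static void count_dump(const char *buf, size_t len)
{
    (void)buf;
    (void)len;
    dump_calls++;
}

/* Writers serialize among themselves and publish the new head with
 * release ordering, so a reader never sees a half-initialized entry. */
static void dumper_register(struct dumper *d)
{
    pthread_mutex_lock(&register_lock);
    atomic_store_explicit(&d->next,
                          atomic_load_explicit(&dump_list, memory_order_relaxed),
                          memory_order_relaxed);
    atomic_store_explicit(&dump_list, d, memory_order_release);
    pthread_mutex_unlock(&register_lock);
}

/* Reader side: takes no list lock and never waits for the log buffer lock.
 * Returns the number of dumpers called, or -1 if the log buffer was busy. */
static int dump_nonblocking(void)
{
    struct dumper *d;
    int called = 0;

    if (pthread_mutex_trylock(&logbuf_lock) != 0)
        return -1;              /* lock held elsewhere: skip, do not block */

    for (d = atomic_load_explicit(&dump_list, memory_order_acquire);
         d != NULL;
         d = atomic_load_explicit(&d->next, memory_order_acquire)) {
        d->dump(logbuf, sizeof(logbuf));
        called++;
    }

    pthread_mutex_unlock(&logbuf_lock);
    return called;
}
```

[The sketch only covers the two locks named above. It says nothing about
whether the dumper callbacks themselves can block, and entries are never
removed, so the grace-period handling that RCU provides on unregister is not
modelled.]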

This looks better, but it still gives me the willies.

I tried tracing through the ramoops code to see if there was anything
else that could block, but I couldn't make it through do_gettimeofday.

I couldn't even make it that far with the mtd oops tracer.

The fact that the code is exported and modular doesn't make me feel
safe because there have been people in the past who have asked for a
notifier on crash so they could do stupid things when the kernel is
broken.

The fact that this wasn't noticed until we actually had a hang doesn't
give me an especially great feeling about long term stability.

Most of all I don't see the use case of calling kmsg_dump when you have
kexec on panic set up to do the same thing.  Having kmsg_dump not on
the kexec on panic code path would let me sleep much easier at night.

Then there is the historical side of this.  Through many failed attempts
it has been shown that dumpers in the kernel are fragile beasts that work
up until you actually have a real world failure and then they let you
down.  Kexec on panic is better as it works 65% or so of the time,
and definitely won't corrupt your bits if it fails.  I don't see what
makes kmsg_dump better than all of the past failed and useless kernel
dumpers.

Eric


