[PATCH] kdump: Fix crash_kexec - smp_send_stop race in panic
Andrew Morton
akpm at linux-foundation.org
Mon Oct 31 06:39:48 EDT 2011
On Mon, 31 Oct 2011 10:57:16 +0100 Michael Holzheu <holzheu at linux.vnet.ibm.com> wrote:
> > Should this be done earlier in the function? As it stands we'll have
> > multiple CPUs scribbling on buf[] at the same time and all trying to
> > print the same thing at the same time, dumping their stacks, etc.
> > Perhaps it would be better to single-thread all that stuff
>
> My first patch took the spinlock at the beginning of panic(). But then
> Eric asked whether it wouldn't be better to get both panic printk's, and I
> agreed.
Hm, why? It will make a big mess.
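To illustrate what single-threading at function entry would look like, here is a minimal userspace sketch. It is not the kernel code or the patch under discussion: threads stand in for CPUs, and the names fake_panic(), panic_lock, stop_all and run_demo() are hypothetical. The first thread to take the lock writes buf[] and prints; the others spin until a flag that stands in for smp_send_stop() is set.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NCPUS 4                 /* assumed number of simulated CPUs */

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_flag panic_lock = ATOMIC_FLAG_INIT; /* taken once, never released */
static atomic_int winners;      /* threads that got past the lock */
static atomic_int stop_all;     /* models the effect of smp_send_stop() */
static char buf[64];            /* shared message buffer, like buf[] in panic() */

static void *fake_panic(void *arg)
{
    int cpu = *(int *)arg;

    /* Lock taken at entry: only the first caller proceeds. */
    if (atomic_flag_test_and_set(&panic_lock)) {
        /* Losers spin here until the winner "stops" them. */
        while (!atomic_load(&stop_all))
            ;
        return NULL;
    }

    /* Only one thread ever reaches this point, so buf[] has one writer. */
    atomic_fetch_add(&winners, 1);
    snprintf(buf, sizeof(buf), "panic on cpu %d", cpu);
    puts(buf);

    atomic_store(&stop_all, 1); /* release the spinning threads */
    return NULL;
}

/* Starts NCPUS threads that all "panic"; returns how many passed the lock. */
int run_demo(void)
{
    pthread_t t[NCPUS];
    int ids[NCPUS];

    for (int i = 0; i < NCPUS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, fake_panic, &ids[i]);
    }
    for (int i = 0; i < NCPUS; i++)
        pthread_join(t[i], NULL);

    return atomic_load(&winners);
}
```

Calling run_demo() (built with -pthread) prints a single panic line and returns 1. The sketch assumes the stop request always reaches the spinning threads, which is exactly what is in question for real CPUs spinning with interrupts disabled.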
> > Also... this patch affects all CPU architectures, all configs, etc.
> > So we're expecting that every architecture's smp_send_stop() is able to
> > stop a CPU which is spinning in spin_lock(), possibly with local
> > interrupts disabled. Will this work?
>
> At least on s390 it will work. If there are architectures that can't
> stop disabled CPUs then this problem is already there without this
> patch.
>
> Example:
>
> 1. 1st CPU gets lock X and panics
> 2. 2nd CPU is disabled and gets lock X
(irq-disabled)
> 3. 1st CPU calls smp_send_stop()
> -> 2nd CPU loops disabled and can't be stopped
Well OK. Maybe some architectures do have this problem - who would
notice? If that is the case, we just made the failure cases much more
common. Could you check, please?