[PATCH] kdump: Fix crash_kexec - smp_send_stop race in panic

Andrew Morton akpm at linux-foundation.org
Mon Oct 31 06:39:48 EDT 2011


On Mon, 31 Oct 2011 10:57:16 +0100 Michael Holzheu <holzheu at linux.vnet.ibm.com> wrote:

> > Should this be done earlier in the function?  As it stands we'll have
> > multiple CPUs scribbling on buf[] at the same time and all trying to
> > print the same thing at the same time, dumping their stacks, etc.
> > Perhaps it would be better to single-thread all that stuff.
> 
> My first patch took the spinlock at the beginning of panic(). But then
> Eric asked if it wouldn't be better to get both panic printk's, and I
> agreed.

Hm, why?  It will make a big mess.
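
For illustration only, here is a minimal userspace sketch of what
"single-threading" the panic path could look like. It uses C11 atomics
rather than kernel primitives, and the names panic_entered and
panic_try_enter() are hypothetical, not taken from the patch. The first
caller wins and would go on to format buf[] and print; any later caller
gets false and would be expected to park itself until it is stopped.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical guard, set once by the first CPU that enters the panic path. */
static atomic_flag panic_entered = ATOMIC_FLAG_INIT;

/*
 * Returns true for the first caller only. Every later caller sees the
 * flag already set and gets false, so it never touches the shared
 * message buffer or prints a second, interleaved panic message.
 */
static bool panic_try_enter(void)
{
    return !atomic_flag_test_and_set(&panic_entered);
}
```

The trade-off is the one under discussion: with a guard like this at
the very top, only the first panic message is printed, whereas taking
the lock later lets both messages out at the cost of interleaved output.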

> > Also...  this patch affects all CPU architectures, all configs, etc. 
> > So we're expecting that every architecture's smp_send_stop() is able to
> > stop a CPU which is spinning in spin_lock(), possibly with local
> > interrupts disabled.  Will this work?
> 
> At least on s390 it will work. If there are architectures that can't
> stop disabled CPUs then this problem is already there without this
> patch.
> 
> Example:
> 
> 1. 1st CPU gets lock X and panics
> 2. 2nd CPU is disabled and gets lock X

(irq-disabled)

> 3. 1st CPU calls smp_send_stop()
>    -> 2nd CPU loops disabled and can't be stopped

Well OK.  Maybe some architectures do have this problem - who would
notice?  If that is the case, we just made the failure cases much more
common.  Could you check, please?



