[RFC] Kdump and memory error handling

Vivek Goyal vgoyal at redhat.com
Wed May 4 23:02:56 EDT 2011

On Wed, May 04, 2011 at 10:39:14PM +0200, Andi Kleen wrote:
> > Any thoughts/suggestions?
> My old attempts to solve this are
> Don't dump on MCE:
> http://git.kernel.org/?p=linux/kernel/git/ak/linux-mce-2.6.git;a=shortlog;h=refs/heads/mce/xpanic
> Handle dumps of corrupted memory regions:
> http://git.kernel.org/?p=linux/kernel/git/ak/linux-mce-2.6.git;a=shortlog;h=refs/heads/mce/crashdump

This idea of disabling MCE temporarily sounds interesting.

The slim dump giving access to the log buffers makes the most sense to me.
Why not leave it to user space to filter out only the log buffers? If a
crash happens due to an MCE, we can probably append an ELF note section
to vmcore, and maybe the user space filtering utility (makedumpfile) can
extract and save only the log portion of the dump if it is an MCE-triggered
crash.
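
To make the user space side of this concrete, here is a rough sketch
(Python, for illustration only) of how a filtering tool might walk the
ELF notes of a vmcore and decide to keep only the log buffer. The note
name "MCE_CRASH" and the helper names are made up for this example;
nothing like them exists in the kernel or in makedumpfile today, and the
sketch assumes little-endian notes with 4-byte padding.

```python
import struct

# Hypothetical note name marking an MCE-triggered crash (an assumption;
# no such note is defined anywhere yet).
MCE_NOTE_NAME = b"MCE_CRASH"


def _align4(n):
    """Round n up to the next multiple of 4 (ELF note padding)."""
    return (n + 3) & ~3


def build_note(name, note_type, desc):
    """Pack one ELF note: 12-byte header, NUL-terminated name, then desc."""
    name_z = name + b"\0"
    header = struct.pack("<III", len(name_z), len(desc), note_type)
    return (header
            + name_z.ljust(_align4(len(name_z)), b"\0")
            + desc.ljust(_align4(len(desc)), b"\0"))


def iter_notes(buf):
    """Yield (name, type, desc) for each note in a PT_NOTE payload."""
    offset = 0
    while offset + 12 <= len(buf):
        namesz, descsz, note_type = struct.unpack_from("<III", buf, offset)
        offset += 12
        name = buf[offset:offset + namesz].rstrip(b"\0")
        offset += _align4(namesz)
        desc = buf[offset:offset + descsz]
        offset += _align4(descsz)
        yield name, note_type, desc


def choose_dump_mode(notes_buf):
    """Return "log-only" if the hypothetical MCE note is present, else "full"."""
    if any(name == MCE_NOTE_NAME for name, _, _ in iter_notes(notes_buf)):
        return "log-only"
    return "full"
```

The point of doing it this way is that the kernel only has to tag the
crash; the policy of what to save stays in user space.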

Of course, this needs to be coupled with Andi's patch to disable MCE
temporarily so that makedumpfile does not induce another crash.

On a side note, can we just save the log buf in an NVRAM area and access it
later using pstore (by Tony Luck)? If we can detect that the system has that
NVRAM capability, then we could skip kdump or something like that.

