[Patch 0/4] Slimdump framework using NT_NOCOREDUMP elf-note

K.Prasad prasad at linux.vnet.ibm.com
Mon Oct 3 03:07:35 EDT 2011


Hi All,
	Please find a set of patches that introduces a 'slimdump'
framework. Details are described below.

Problem
--------
A system configured with kdump captures the kernel memory for all
types of crashes, even when it doesn't make much sense to do so.
For instance, system crashes triggered by hardware errors don't need
a complete dump of the memory for investigation.

In the case of crashes triggered by fatal machine check exceptions (MCE)
due to unrecoverable memory errors, it is even dangerous to read the
crashing kernel's memory. When the kexec kernel reads the crashing
kernel's memory, it 'consumes' the data from the faulty memory location,
potentially causing a recursion of faults.

This problem was previously discussed in the kernel community, with a
proposal to leave out kernel memory regions from /proc/vmcore (see the
mail threads pertaining to
http://article.gmane.org/gmane.linux.kernel/1148266). However, there were
suggestions against making this behaviour a kernel policy.

Solution
---------
Since capturing the crashing kernel's memory for crashes induced by
hardware errors is either unnecessary or dangerous, we introduce a
mechanism to generate a 'slimdump'.

Basically, a new elf-note of type NT_NOCOREDUMP is added by the kernel
to the vmcore. It is recognised by all tools in the kdump chain, which
then generate and save a 'slimdump' that contains only the elf-headers
and the elf-note section. The elf-note section may be used to add a
description of the cause of the error.

The enclosed set of patches changes the kernel, kexec, makedumpfile
and the crash tool to make them recognise the NT_NOCOREDUMP elf-note
and generate a 'slimdump'. Also, fatal MCEs in the kernel are turned
into a consumer of the slimdump mechanism to prevent collection of a
normal kdump.

Alternatively, the user has an option (through suitable makedumpfile or
kdump configuration options) to collect the complete vmcore or to
extract the 'dmesg' from /proc/vmcore.
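
The resulting policy can be summarised by the small C sketch below. It
is only an illustration of the behaviour described above, not the
actual makedumpfile or kdump logic; the enum values and
choose_dump_mode() are hypothetical names, and the real configuration
options are defined by the patches themselves.

```c
/* Hypothetical names, used only to illustrate the described policy. */
enum dump_mode { DUMP_FULL, DUMP_DMESG_ONLY, DUMP_SLIM };
enum user_pref { PREF_DEFAULT, PREF_FORCE_FULL, PREF_DMESG };

/*
 * An explicit user choice wins. Otherwise the presence of the
 * NT_NOCOREDUMP note selects a slimdump, and its absence selects the
 * normal full dump.
 */
static enum dump_mode choose_dump_mode(int has_nocoredump_note,
				       enum user_pref pref)
{
	if (pref == PREF_FORCE_FULL)
		return DUMP_FULL;
	if (pref == PREF_DMESG)
		return DUMP_DMESG_ONLY;
	return has_nocoredump_note ? DUMP_SLIM : DUMP_FULL;
}
```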

Screen logs
-------------
# mce-inject ~/mce/mce-test/cases/soft-inj/panic_ucr/data/srar_over
[ 4934.748416] [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 2: f580000000000000
[ 4934.749079] [Hardware Error]: RIP 73:<000000001eadbabe> 
[ 4934.749079] [Hardware Error]: TSC ef029a23417 ADDR 1234 
[ 4934.749079] [Hardware Error]: PROCESSOR 0:663 TIME 1317149322 SOCKET 0 APIC 0
[ 4934.749079] [Hardware Error]: Run the above through 'mcelog --ascii'
[ 4934.749079] [Hardware Error]: Machine check: Overflowed uncorrected
[ 4934.749079] Kernel panic - not syncing: Fatal machine check on current CPU
[ 4934.749079] Pid: 1379, comm: mce-inject Tainted: G   M 3.1.0-rc4.slimdump+ #34
[ 4934.749079] Call Trace:
[ 4934.749079]  [<ffffffff81084922>] panic+0xbc/0x1cf
[ 4934.749079]  [<ffffffff810858ff>] ? printk+0x6c/0x6e
[ 4934.749079]  [<ffffffff8104c43b>] mce_panic+0x187/0x1a4
[ 4934.749079]  [<ffffffff8104d525>] do_machine_check+0x5ec/0x6c3
[ 4934.749079]  [<ffffffff8104e4e1>] raise_exception+0x5c/0x84
[ 4934.749079]  [<ffffffff8104e5e9>] raise_local+0x5a/0xcc
[ 4934.749079]  [<ffffffff8104e8ee>] mce_write+0x218/0x24e
[ 4934.749079]  [<ffffffff8115abee>] vfs_write+0xb0/0x108
[ 4934.749079]  [<ffffffff8115ad0a>] sys_write+0x4c/0x71
[ 4934.749079]  [<ffffffff815bf12b>] system_call_fastpath+0x16/0x1b
[    0.817861] kvm: no hardware support
..............
................
.................
# ls
vmcore
# ls -lh vmcore
-r-------- 1 root root 1.8G Sep 27 13:20 vmcore
# ~/makedumpfile.slimdump/makedumpfile vmcore vmcore.makedumpfile.review
The kernel version is not supported.
The created dumpfile may be incomplete.
Copying data                       : [100 %] 

The dumpfile is saved to vmcore.makedumpfile.review.

makedumpfile Completed.
# ls -lh vmcore.makedumpfile.review
-rw------- 1 root root 3.9K Sep 28 01:40 vmcore.makedumpfile.review
# eu-readelf -n vmcore.makedumpfile.review

Note segment of 3592 bytes at offset 0x158:
  Owner          Data size  Type
  CORE                 336  PRSTATUS
    info.si_signo: 0, info.si_code: 0, info.si_errno: 0, cursig: 0
    sigpend: <>
..........
.............
.........
    NUMBER(PG_private)=11
    NUMBER(PG_swapcache)=16
    SYMBOL(phys_base)=ffffffff81a0e010
    SYMBOL(init_level4_pgt)=ffffffff81a06000
    SYMBOL(node_data)=ffffffff81b70b80
    LENGTH(node_data)=512
    CRASHTIME=1317621133

  PANIC_MCE             49  <unknown>: 21
# crash -S ~/linux-2.6.slimdump/System.map ~/linux-2.6.slimdump/vmlinux vmcore.makedumpfile.review

crash 5.1.8
Copyright (C) 2002-2011  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public
License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for
details.
 
crash: overriding /boot/System.map with
/home/prasadkr/linux-2.6.slimdump/System.map
"System crashed due to a hardware memory error. No coredump available."
Nocoredump Reason: PANIC_MCE
crash: Elf64_Phdr pointer: 1c46170  ELF header end: 1c46130

-------
Thanks,
K.Prasad



