[Crash-utility] Re: [PATCH 0/2] vmcoreinfo support for dump filtering #2
anderson at redhat.com
Tue Sep 11 10:03:43 EDT 2007
Vivek Goyal wrote:
> On Mon, Sep 10, 2007 at 11:35:21AM -0700, Randy Dunlap wrote:
>>On Fri, 7 Sep 2007 17:57:46 +0900 Ken'ichi Ohmichi wrote:
>>>I released a new makedumpfile (version 1.2.0) with vmcoreinfo support.
>>>I updated the patches for linux and kexec-tools.
>>>[1/2] [linux-2.6.22] Add vmcoreinfo
>>> The patch is for linux-2.6.22.
>>> The patch adds the vmcoreinfo data. Its address and size are output
>>> to /sys/kernel/vmcoreinfo.
>>>[2/2] [kexec-tools] Pass vmcoreinfo's address and size
>>> The patch is for kexec-tools-testing-20070330.
>>> kexec command gets the address and size of the vmcoreinfo data from
>>> /sys/kernel/vmcoreinfo, and passes them to the second kernel through
>>> ELF header of /proc/vmcore. When the second kernel is booting, the
>>> kernel gets them from the ELF header and creates vmcoreinfo's PT_NOTE
>>> segment into /proc/vmcore.
>>When using the vmcoreinfo patches, what tool(s) are available for
>>analyzing the vmcore (dump) file? E.g., lkcd or crash or just gdb?
>>gdb works for me, but I tried to use crash (4.0-4.6 from
>>http://people.redhat.com/anderson/) and crash complained:
>>crash: invalid kernel virtual address: 0 type: "cpu_pda entry"
>>Should crash work, or does it need to be modified?
> Hi Randy,
> Crash should just work. It might be broken on the latest kernel. Copying it
> to crash-utility mailing list. Dave will be able to tell us better.
>>This is on a 2.6.23-rc3 kernel with vmcoreinfo patches and a dump file
>>with -l 31 (dump level 31, omitting all possible pages).
There's always the possibility that something crucial (to the crash
utility) has changed in the upstream kernel; that's just the nature
of the beast.
In this case, crash is reading this set of per-cpu pointers:
struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly;
and for each one, it then reads the x8664_pda data structure
that it points to -- but finds a NULL. It's possible that it
has incorrectly determined the number of x8664_pda structures
(cpus) that exist. Or less likely, the array contents were read
as zeroes from the dumpfile.
Anyway, with any initialization-time failure, it's usually helpful
to invoke crash with the "-d7" (debug level) argument, as in:
$ crash -d7 vmlinux vmcore
That will display information re: every read made to the dumpfile.
In this case, normally you would see, for each cpu, a read of the
individual 8-byte address from the array, and then based upon what
it read, the subsequent read of the whole 128-byte data structure:
<readmem: ffffffff8042d9c0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffffffff80406000, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU0: level4_pgt: 200000010 data_offset: ffff8100899c1000
<readmem: ffffffff8042d9c8, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff027c0, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU1: level4_pgt: 200000010 data_offset: ffff8100899c9000
<readmem: ffffffff8042d9d0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff19e40, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU2: level4_pgt: 200000010 data_offset: ffff8100899d1000
<readmem: ffffffff8042d9d8, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffff81003ff19640, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
CPU3: level4_pgt: 200000010 data_offset: ffff8100899d9000
<readmem: ffffffff8042d9e0, KVADDR, "_cpu_pda addr", 8, (FOE), 7fbffff210>
<readmem: ffffffff80406200, KVADDR, "cpu_pda entry", 128, (FOE), 937680>
From that data structure it grabs the level4_pgt and data_offset
fields for subsequent use. So in your case, it should show how
many (if any) of the x8664_pda structures it read before encountering
a NULL pointer in one of the array entries.
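The read pattern described above can be sketched in C. This is only an illustration of the arithmetic and of the "stop at the first NULL" idea, not crash's actual code: the names pda_slot_addr, pdas_before_null and PDA_SLOT_SIZE are hypothetical, and the slot values are assumed to have already been read from the dumpfile into a local array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each _cpu_pda[] slot is one 8-byte pointer on x86_64 (the "8" in the
 * "_cpu_pda addr" readmem lines). Name is hypothetical. */
#define PDA_SLOT_SIZE 8u

/* Hypothetical helper: kernel virtual address of the _cpu_pda[cpu] slot,
 * given the address of the start of the array. */
static uint64_t pda_slot_addr(uint64_t array_base, unsigned int cpu)
{
    return array_base + (uint64_t)cpu * PDA_SLOT_SIZE;
}

/* Hypothetical helper: given slot values already read from the dumpfile,
 * return how many non-NULL x8664_pda pointers come before the first NULL.
 * If no slot is NULL, the result equals nr_slots. */
static unsigned int pdas_before_null(const uint64_t *slots, unsigned int nr_slots)
{
    unsigned int cpu;

    assert(slots != NULL);
    for (cpu = 0; cpu < nr_slots; cpu++) {
        if (slots[cpu] == 0)
            break;  /* this is where "invalid kernel virtual address: 0" would surface */
    }
    return cpu;
}
```

With the array base from the sample output, pda_slot_addr(0xffffffff8042d9c0, 1) gives ffffffff8042d9c8, which matches the second "_cpu_pda addr" read. Comparing the value returned by pdas_before_null against the number of cpus actually present would help tell a miscounted cpu total apart from array contents that were read as zeroes.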