[Crash-utility] Re: [PATCH 0/2] vmcoreinfo support for dump filtering #2

Dave Anderson anderson at redhat.com
Tue Sep 11 14:12:00 EDT 2007


Randy Dunlap wrote:
>
> I have the vmcoreinfo patch applied.
> Kernel is 2.6.23-rc3.
> 
> The crash debug output is below.  Please let me know if you'd like
> me to test without the vmcoreinfo patch or anything else.
> 
> ---
> 
> crash 4.0-4.6
> Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006  Fujitsu Limited
> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> Copyright (C) 2005  NEC Corporation
> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for details.
>  
> vmcore_data: 
>                   flags: c0 (KDUMP_LOCAL|KDUMP_ELF64)
>                    ndfd: 3
>                     ofp: 322af48760
>             header_size: 1580
>    num_pt_load_segments: 4
>      pt_load_segment[0]:
>             file_offset: 62c
>              phys_start: 200000
>                phys_end: bda000
>               zero_fill: 0
>      pt_load_segment[1]:
>             file_offset: 9da62c
>              phys_start: 0
>                phys_end: a0000
>               zero_fill: 0
>      pt_load_segment[2]:
>             file_offset: a7a62c
>              phys_start: 100000
>                phys_end: 1000000
>               zero_fill: 0
>      pt_load_segment[3]:
>             file_offset: 197a62c
>              phys_start: 5000000
>                phys_end: 3ffc0000
>               zero_fill: 0
>              elf_header: 21b2c70
>                   elf32: 0
>                 notes32: 0
>                  load32: 0
>                   elf64: 21b2c70
>                 notes64: 21b2cb0
>                  load64: 21b2ce8
>             nt_prstatus: 21b2dc8
>             nt_prpsinfo: 0
>           nt_taskstruct: 0
>             task_struct: 0
>               page_size: 0
>            switch_stack: 0
>          xen_kdump_data: (unused)
>        num_prstatus_notes: 1
>        nt_prstatus_percpu: 00000000021b2dc8
> 
> 
> Elf64_Ehdr:
>                 e_ident: \177ELF
>       e_ident[EI_CLASS]: 2 (ELFCLASS64)
>        e_ident[EI_DATA]: 1 (ELFDATA2LSB)
>     e_ident[EI_VERSION]: 1 (EV_CURRENT)
>       e_ident[EI_OSABI]: 0 (ELFOSABI_SYSV)
>  e_ident[EI_ABIVERSION]: 0
>                  e_type: 4 (ET_CORE)
>               e_machine: 62 (EM_X86_64)
>               e_version: 1 (EV_CURRENT)
>                 e_entry: 0
>                 e_phoff: 40
>                 e_shoff: 0
>                 e_flags: 0
>                e_ehsize: 40
>             e_phentsize: 38
>                 e_phnum: 5
>             e_shentsize: 0
>                 e_shnum: 0
>              e_shstrndx: 0
> Elf64_Phdr:
>                  p_type: 4 (PT_NOTE)
>                p_offset: 344 (158)
>                 p_vaddr: 0
>                 p_paddr: 0
>                p_filesz: 1236 (4d4)
>                 p_memsz: 1236 (4d4)
>                 p_flags: 0 ()
>                 p_align: 0
> Elf64_Phdr:
>                  p_type: 1 (PT_LOAD)
>                p_offset: 1580 (62c)
>                 p_vaddr: ffffffff80200000
>                 p_paddr: 200000
>                p_filesz: 10330112 (9da000)
>                 p_memsz: 10330112 (9da000)
>                 p_flags: 7 (PF_X|PF_W|PF_R)
>                 p_align: 0
> Elf64_Phdr:
>                  p_type: 1 (PT_LOAD)
>                p_offset: 10331692 (9da62c)
>                 p_vaddr: ffff810000000000
>                 p_paddr: 0
>                p_filesz: 655360 (a0000)
>                 p_memsz: 655360 (a0000)
>                 p_flags: 7 (PF_X|PF_W|PF_R)
>                 p_align: 0
> Elf64_Phdr:
>                  p_type: 1 (PT_LOAD)
>                p_offset: 10987052 (a7a62c)
>                 p_vaddr: ffff810000100000
>                 p_paddr: 100000
>                p_filesz: 15728640 (f00000)
>                 p_memsz: 15728640 (f00000)
>                 p_flags: 7 (PF_X|PF_W|PF_R)
>                 p_align: 0
> Elf64_Phdr:
>                  p_type: 1 (PT_LOAD)
>                p_offset: 26715692 (197a62c)
>                 p_vaddr: ffff810005000000
>                 p_paddr: 5000000
>                p_filesz: 989593600 (3afc0000)
>                 p_memsz: 989593600 (3afc0000)
>                 p_flags: 7 (PF_X|PF_W|PF_R)
>                 p_align: 0
> Elf64_Nhdr:
>                n_namesz: 5 ("CORE")
>                n_descsz: 336
>                  n_type: 1 (NT_PRSTATUS)
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000002b1a 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000000006 0000000000000000 
>                          0000000000000063 0000000000000000 
>                          ffff810019769e48 ffffffff806aeda0 
>                          ffff81003d64eac0 ffffffff8023b1df 
>                          ffffffff8023b1df ffff810019769ca8 
>                          0000000000000000 0000000000000000 
>                          ffff8100848a9000 0000000000000000 
>                          0000000000000000 0000000000000292 
>                          ffffffff80260532 0000000000000010 
>                          0000000000000046 ffff810019769d98 
>                          0000000000000018 00002b7fb481df10 
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
>                          0000000000000000 0000000000000000 
> Elf64_Nhdr:
>                n_namesz: 11 ("VMCOREINFO")
>                n_descsz: 856
>                  n_type: 0 (?)
>                          41454c4552534f00 322e362e323d4553 
>                          41500a3363722d33 343d455a49534547 
>                          424d59530a363930 5f74696e69284c4f 
>                          3d29736e5f737475 6666666666666666 
>                          3036373538363038 284c4f424d59530a 
>                          6c6e6f5f65646f6e 2970616d5f656e69 
>                          666666666666663d 3432633037303866 
>                          4c4f424d59530a30 7265707061777328 
>                          297269645f67705f 666666666666663d 
>                          3030313032303866 4c4f424d59530a30 
>                          2974786574735f28 666666666666663d 
>                          3030393032303866 7028455a49530a30 
>                          0a36393d29656761 6c677028455a4953 
>                          617461645f747369 0a30323734313d29 
>                          6e6f7a28455a4953 0a343230313d2965 
>                          65726628455a4953 3d29616572615f65 
>                          28455a49530a3432 6165685f7473696c 
>                          464f0a36313d2964 6761702854455346 
>                          297367616c662e65 455346464f0a303d 
>                          5f2e656761702854 383d29746e756f63 
>                          2854455346464f0a 70616d2e65676170 
>                          34323d29676e6970 2854455346464f0a 
>                          75726c2e65676170 46464f0a30383d29 
>                          696c677028544553 2e617461645f7473 
>                          6e6f7a5f65646f6e 464f0a303d297365 
>                          6c67702854455346 617461645f747369 
>                          656e6f7a5f726e2e 30363534313d2973 
>                          2854455346464f0a 645f7473696c6770 
>                          65646f6e2e617461 70616d5f6d656d5f 
>                          0a38363534313d29 702854455346464f 
>                          61645f7473696c67 5f65646f6e2e6174 
>                          66705f7472617473 34383534313d296e 
>                          2854455346464f0a 645f7473696c6770 
>                          65646f6e2e617461 64656e6e6170735f 
>                          3d2973656761705f 464f0a3030363431 
>                          6c67702854455346 617461645f747369 
>                          64695f65646f6e2e 0a38303634313d29 
>                          7a2854455346464f 656572662e656e6f 
>                          323d29616572615f 455346464f0a3030 
>                          762e656e6f7a2854 3d29746174735f6d 
>                          5346464f0a323336 2e656e6f7a285445 
>                          5f64656e6e617073 393d297365676170 
>                          455346464f0a3633 615f656572662854 
>                          656572662e616572 303d297473696c5f 
>                          2854455346464f0a 6165685f7473696c 
>                          3d297478656e2e64 54455346464f0a30 
>                          65685f7473696c28 29766572702e6461 
>                          54474e454c0a383d 662e656e6f7a2848 
>                          616572615f656572 4d59530a31313d29 
>                          65646f6e284c4f42 663d29617461645f 
>                          3866666666666666 0a30346561303730 
>                          6e284854474e454c 617461645f65646f 
>                          4152430a34363d29 313d454d49544853 
>                          3938333339383831 
> p_vaddr: ffffffff80200000 p_paddr: 200000 -> phys_base: 0
> 
> gdb /boot/vmlinux-2.6.23-rc3 
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> 
> <readmem: ffffffff8053bc40, KVADDR, "kernel_config_data", 32768, (ROE), 3723960>
> crash: CONFIG_NR_CPUS: 8
> crash: CONFIG_HZ: 250
> WARNING: Because this kernel was compiled with gcc version 4.1.1, certain
>          commands or command options may fail unless crash is invoked with
>          the  "--readnow" command line option.
> 
> GNU_GET_DATATYPE[runqueue]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[runqueue]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[irq_desc_t]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[hw_interrupt_type]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[irq_cpustat_t]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[irq_cpustat_t]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[irq_cpustat_t]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[timer_vec_root]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[timer_vec]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[softirq_state]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[kallsyms_header]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> GNU_GET_DATATYPE[user_regs_struct]: returned via gdb_error_hook (1 buffer in use)
> <readmem: ffffffff80837580, KVADDR, "xtime", 16, (FOE), a09c90>
> <readmem: ffffffff80685764, KVADDR, "init_uts_ns", 390, (ROE), a0a27c>
> <readmem: ffffffff80537000, KVADDR, "accessible check", 8, (ROE|Q), 7fff10bb8dc8>
> <readmem: ffffffff80537000, KVADDR, "readstring characters", 1499, (ROE|Q), 7fff10bb7db0>
> verify_namelist:
> /proc/version:
> Linux version 2.6.23-rc3 (rddunlap at unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #19 SMP Tue Sep 4 09:52:06 PDT 2007
> utsname version: #19 SMP Tue Sep 4 09:52:06 PDT 2007
> /boot/vmlinux-2.6.23-rc3:
> Linux version 2.6.23-rc3 (rddunlap at unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #22 SMP Thu Sep 6 21:24:54 PDT 2007
> 
> <readmem: ffffffff80707940, KVADDR, "_cpu_pda addr", 8, (FOE), 7fff10bba538>
> <readmem: 0, KVADDR, "cpu_pda entry", 128, (FOE), a3a820>
> crash: invalid kernel virtual address: 0  type: "cpu_pda entry"

A few things come to mind.  Walking through the debug data above...

The very first readmem() from the dumpfile is from the kernel symbol 
"kernel_config_data", where you can see that it found the CONFIG_HZ and
CONFIG_NR_CPUS values.  The next readmem()'s are of "xtime" and then 
"init_uts_ns".  We don't know what was read from the "xtime" location,
but the utsname data from "init_uts_ns" gets displayed later on here:

 > utsname version: #19 SMP Tue Sep 4 09:52:06 PDT 2007

And then the "linux_banner" address of ffffffff80537000 is first
checked for accessibility (OK), and then it is read successfully,
and its contents are displayed here:

 > /proc/version:
 > Linux version 2.6.23-rc3 (rddunlap at unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #19 SMP Tue Sep 4 09:52:06 PDT 2007

The string above from the dumpfile is correlated against the
linux_banner string in the vmlinux file, which is subsequently
displayed here:

 > /boot/vmlinux-2.6.23-rc3:
 > Linux version 2.6.23-rc3 (rddunlap at unicorn.site) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #22 SMP Thu Sep 6 21:24:54 PDT 2007

The utsname data and the linux_banner string from the dumpfile
are from "Tue Sep 4 09:52:06 PDT 2007", whereas the vmlinux file
was built 2 days later at "Thu Sep 6 21:24:54 PDT 2007".  I don't
know whether that's the issue or not.  Is there a reason that
you are *not* using the same vmlinux that the dumpfile was created
from?
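To make the comparison above concrete, here is a minimal Python sketch that
pulls the build number and build date out of the two banner strings and
reports whether they match.  The helper names and the regular expression are
made up for this illustration; this is not how crash's verify_namelist code
is implemented.

```python
import re

# Hypothetical helper, not part of crash: match the "#N SMP <date>" build
# stamp that ends a linux_banner string.
BUILD_STAMP = re.compile(r"#(\d+) (?:SMP )?(.+)$")

def build_stamp(banner):
    """Return (build_number, build_date), or None if no stamp is found."""
    m = BUILD_STAMP.search(banner.strip())
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

def same_build(dump_banner, vmlinux_banner):
    """True only when both banners carry an identical build stamp."""
    a = build_stamp(dump_banner)
    b = build_stamp(vmlinux_banner)
    return a is not None and a == b

# The two banner strings from the debug output above.
dump_banner = ("Linux version 2.6.23-rc3 (rddunlap at unicorn.site) "
               "(gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) "
               "#19 SMP Tue Sep 4 09:52:06 PDT 2007")
vmlinux_banner = ("Linux version 2.6.23-rc3 (rddunlap at unicorn.site) "
                  "(gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) "
                  "#22 SMP Thu Sep 6 21:24:54 PDT 2007")

print(build_stamp(dump_banner))
print(build_stamp(vmlinux_banner))
print(same_build(dump_banner, vmlinux_banner))
```

With the strings from this report, the two stamps differ in both build
number (#19 vs. #22) and date, so same_build() returns False.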

So the first thing to verify is that you use the same vmlinux
that was booted and dumped.  If you cannot dig up the original
vmlinux file, get the System.map file from the dumped kernel,
and throw that on the command line, and see if that helps:

  $ crash vmlinux vmcore System.map

Anyway, next it reads the _cpu_pda[0] at address ffffffff80707940 to
find the address of cpu 0's x8664_pda structure:

 > <readmem: ffffffff80707940, KVADDR, "_cpu_pda addr", 8, (FOE), 7fff10bba538>

But it finds a zero there:

 > <readmem: 0, KVADDR, "cpu_pda entry", 128, (FOE), a3a820>
 > crash: invalid kernel virtual address: 0  type: "cpu_pda entry"

At this point crash is done.  The readmem() is "FOE" (fault-on-error)
because there's no sense in continuing.
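The two-step lookup and the fault-on-error behavior can be sketched roughly
as follows.  This is a toy model only: the dictionary stands in for the
dumpfile, and readmem() here is a simplified stand-in written for this
example, not crash's actual function or signature.

```python
class ReadFault(Exception):
    """Raised when a fault-on-error read cannot be satisfied."""

def readmem(memory, addr, what, fault_on_error=True):
    # 'memory' is a toy dict mapping address -> value; it stands in for
    # the dumpfile.  Address 0 is never a valid kernel virtual address.
    if addr == 0 or addr not in memory:
        if fault_on_error:
            raise ReadFault('invalid kernel virtual address: %x  type: "%s"'
                            % (addr, what))
        return None
    return memory[addr]

def read_cpu0_pda(memory, cpu_pda_symbol):
    # Step 1: read the pointer stored in _cpu_pda[0].
    pda_addr = readmem(memory, cpu_pda_symbol, "_cpu_pda addr")
    # Step 2: follow that pointer to the x8664_pda structure itself.
    return readmem(memory, pda_addr, "cpu_pda entry")

CPU_PDA = 0xffffffff80707940      # address of _cpu_pda per the debug output
toy_dump = {CPU_PDA: 0}           # _cpu_pda[0] holds zero, as in this report

message = None
try:
    read_cpu0_pda(toy_dump, CPU_PDA)
except ReadFault as e:
    message = str(e)
    print("crash:", message)
```

The first read succeeds (the symbol address is readable), but the value it
returns is zero, so the second read faults.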

If the vmlinux and dumpfile are different, it's possible that the
_cpu_pda[] array, which is the highest address read so far apart from
xtime (whose data, being even higher, may be garbage as well), has
been "pushed up" by some other changes in the kernel.

Or, if they do "line up", something may have changed with respect
to the kernel's _cpu_pda[] handling or its data declaration.

Or, it actually read zeroes from the dumpfile.
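One way to check the last possibility by hand is to work out where in the
dumpfile the _cpu_pda[0] pointer should live, and look at the bytes there
with a hex dump tool.  The sketch below uses the PT_LOAD headers from the
debug output above and matches directly on p_vaddr.  That is a
simplification for illustration (it is not necessarily how crash does its
lookup), and it assumes an ELF vmcore whose segments are laid out exactly as
those headers describe.

```python
# PT_LOAD program headers copied from the debug output above:
# (p_vaddr, p_offset, p_filesz)
PT_LOAD = [
    (0xffffffff80200000, 0x62c,     0x9da000),
    (0xffff810000000000, 0x9da62c,  0xa0000),
    (0xffff810000100000, 0xa7a62c,  0xf00000),
    (0xffff810005000000, 0x197a62c, 0x3afc0000),
]

def vaddr_to_file_offset(vaddr, segments=PT_LOAD):
    """Return the dumpfile offset backing 'vaddr', or None if unmapped."""
    for p_vaddr, p_offset, p_filesz in segments:
        if p_vaddr <= vaddr < p_vaddr + p_filesz:
            return p_offset + (vaddr - p_vaddr)
    return None

# Address that crash used for _cpu_pda[0] in the readmem trace.
offset = vaddr_to_file_offset(0xffffffff80707940)
print(hex(offset))   # 0x507f6c
```

The address falls in the first PT_LOAD segment, so the 8 bytes at file
offset 0x507f6c are what crash would have read as the cpu 0 pda pointer.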

But, for now let's suppose that the two kernels are identical except
for the date in the linux_banner strings.  I don't have a 2.6.23
kernel source tree handy, but at least as of 2.6.22-5, it was still
declared statically like so:

   struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly;

Has that changed?

If not, it would be worth checking a dumpfile with no pages
excluded with makedumpfile.  I wouldn't think the in-kernel
part of the vmcoreinfo patches would make a difference, but
I suppose anything's possible.

You also mentioned that gdb worked OK.  What happens when
you enter this:

   (gdb) p _cpu_pda[0]

And if you enter:

   (gdb) p &_cpu_pda[0]

does it show 0xffffffff80707940?  That is what crash thinks is
the correct address:

 > <readmem: ffffffff80707940, KVADDR, "_cpu_pda addr", 8, (FOE), 7fff10bba538>

But again -- the very first thing to do is make sure that you
are using the exact same vmlinux as was booted/dumped.

Dave







