kdump: quad core Opteron

Chandru chandru at in.ibm.com
Wed Oct 8 09:40:29 EDT 2008


Vivek Goyal wrote:
> Hi Chandru,
>
> How much memory does this system have? Can you also paste the output of 
> /proc/iomem of the first kernel?
>
> Does this system have GART? It looks like we are accessing some memory area
> which the platform does not like. (We saw issues with GART in the past.)
>   
The system has 8GB of RAM. Without the mem=4G boot parameter, /proc/iomem 
shows the following:

[root@abc]# cat /proc/iomem

00000000-0009afff : System RAM
0009b000-0009ffff : reserved
000e0000-000fffff : reserved
00100000-cff9f6ff : System RAM
00200000-0048adc9 : Kernel code
0048adca-005ee18f : Kernel data
0076d000-00823a4b : Kernel bss
02000000-11ffffff : Crash kernel
20000000-23ffffff : GART
cff9f700-cffa6fff : ACPI Tables
cffa7000-cfffffff : reserved
d4000000-d41fffff : PCI Bus 0000:01
d4000000-d41fffff : PCI Bus 0000:02
d4000000-d41fffff : PCI Bus 0000:03
d4000000-d41fffff : 0000:03:00.0
d4200000-d421ffff : 0000:00:05.0
d6000000-d6ffffff : PCI Bus 0000:0a
d7000000-d7ffffff : PCI Bus 0000:09
d8000000-d8ffffff : PCI Bus 0000:08
d9000000-d9ffffff : PCI Bus 0000:07
db000000-dcffffff : PCI Bus 0000:05
dc000000-dcffffff : PCI Bus 0000:06
de000000-e70fffff : PCI Bus 0000:01
de000000-e50fffff : PCI Bus 0000:02
de000000-e2ffffff : PCI Bus 0000:04
de000000-dfffffff : 0000:04:00.0
de000000-dfffffff : bnx2
e0000000-e1ffffff : 0000:04:00.1
e0000000-e1ffffff : bnx2
e4000000-e50fffff : PCI Bus 0000:03
e5000000-e500ffff : 0000:03:00.0
e5000000-e500ffff : mpt
e5010000-e5013fff : 0000:03:00.0
e5010000-e5013fff : mpt
e7000000-e701ffff : 0000:01:00.0
e8000000-efffffff : 0000:00:05.0
f3fed000-f3fedfff : 0000:00:0f.2
f3fed000-f3fedfff : ehci_hcd
f3fee000-f3feefff : 0000:00:0f.1
f3fee000-f3feefff : ohci_hcd
f3fef000-f3feffff : 0000:00:0f.0
f3fef000-f3feffff : ohci_hcd
f3ff0000-f3ffffff : 0000:00:05.0
f4000000-fbffffff : reserved
fa000000-faafffff : PCI MMCONFIG 0
fec00000-ffffffff : reserved
fec00000-fec00fff : IOAPIC 0
fec02000-fec02fff : IOAPIC 1
fed00000-fed003ff : HPET 0
fee00000-fee00fff : Local APIC
100000000-22fffffff : System RAM



With mem=4G, /proc/iomem is as follows. The GART memory range seems to be 
missing here:

00000000-0009afff : System RAM
0009b000-0009ffff : reserved
000e0000-000fffff : reserved
00100000-cff9f6ff : System RAM
00200000-0048adc9 : Kernel code
0048adca-005ee18f : Kernel data
0076d000-00823a4b : Kernel bss
02000000-11ffffff : Crash kernel
cff9f700-cffa6fff : ACPI Tables
cffa7000-cfffffff : reserved
d4000000-d41fffff : PCI Bus 0000:01
d4000000-d41fffff : PCI Bus 0000:02
d4000000-d41fffff : PCI Bus 0000:03
d4000000-d41fffff : 0000:03:00.0
d4200000-d421ffff : 0000:00:05.0
d6000000-d6ffffff : PCI Bus 0000:0a
d7000000-d7ffffff : PCI Bus 0000:09
d8000000-d8ffffff : PCI Bus 0000:08
d9000000-d9ffffff : PCI Bus 0000:07
db000000-dcffffff : PCI Bus 0000:05
dc000000-dcffffff : PCI Bus 0000:06
de000000-e70fffff : PCI Bus 0000:01
de000000-e50fffff : PCI Bus 0000:02
de000000-e2ffffff : PCI Bus 0000:04
de000000-dfffffff : 0000:04:00.0
de000000-dfffffff : bnx2
e0000000-e1ffffff : 0000:04:00.1
e0000000-e1ffffff : bnx2
e4000000-e50fffff : PCI Bus 0000:03
e5000000-e500ffff : 0000:03:00.0
e5000000-e500ffff : mpt
e5010000-e5013fff : 0000:03:00.0
e5010000-e5013fff : mpt
e7000000-e701ffff : 0000:01:00.0
e8000000-efffffff : 0000:00:05.0
f3fed000-f3fedfff : 0000:00:0f.2
f3fed000-f3fedfff : ehci_hcd
f3fee000-f3feefff : 0000:00:0f.1
f3fee000-f3feefff : ohci_hcd
f3fef000-f3feffff : 0000:00:0f.0
f3fef000-f3feffff : ohci_hcd
f3ff0000-f3ffffff : 0000:00:05.0
f4000000-fbffffff : reserved
fa000000-faafffff : PCI MMCONFIG 0
fec00000-ffffffff : reserved
fec00000-fec00fff : IOAPIC 0
fec02000-fec02fff : IOAPIC 1
fed00000-fed003ff : HPET 0
fee00000-fee00fff : Local APIC
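
The two listings can also be compared mechanically rather than by eye. What follows is only a minimal sketch, assuming both /proc/iomem outputs were saved as text; the function names and the shortened sample listings are made up for illustration and are not part of kexec-tools.

```python
def parse_iomem(text):
    """Return a set of (start, end, name) tuples from /proc/iomem-style text."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if " : " not in line:
            continue  # skip blank lines and anything that is not a range
        span, name = line.split(" : ", 1)
        start, end = (int(x, 16) for x in span.split("-"))
        entries.add((start, end, name.strip()))
    return entries


def missing_ranges(full, limited):
    """Ranges present in `full` but absent from `limited`, sorted by start."""
    return sorted(parse_iomem(full) - parse_iomem(limited))


# Shortened excerpts of the two listings above (hypothetical variable names).
WITHOUT_MEM_4G = """\
00100000-cff9f6ff : System RAM
02000000-11ffffff : Crash kernel
20000000-23ffffff : GART
cff9f700-cffa6fff : ACPI Tables
100000000-22fffffff : System RAM
"""
WITH_MEM_4G = """\
00100000-cff9f6ff : System RAM
02000000-11ffffff : Crash kernel
cff9f700-cffa6fff : ACPI Tables
"""

for start, end, name in missing_ranges(WITHOUT_MEM_4G, WITH_MEM_4G):
    print(f"{start:08x}-{end:08x} : {name}")
```

Comparing the full listings line by line gives the same picture: the only entries that differ are the GART range and the System RAM range above 4GB.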


> Can you also provide the /proc/vmcore ELF header (readelf output) in both
> cases (with mem=4G and without it)?
>   
ELF header with mem=4G

ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 5
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                 0x0000000000000b20 0x0000000000000b20         0
  LOAD           0x0000000000000c78 0xffffffff80200000 0x0000000000200000
                 0x0000000000624000 0x0000000000624000  RWE    0
  LOAD           0x0000000000624c78 0xffff810000000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000006c4c78 0xffff810000100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x00000000025c4c78 0xffff810012000000 0x0000000012000000
                 0x00000000bdf9f700 0x00000000bdf9f700  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000158 with length 0x00000b20:
Owner Data size Description
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)


------------------------------------------------------------------------------------

ELF header without mem=4G

ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 6
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000190 0x0000000000000000 0x0000000000000000
                 0x0000000000000b20 0x0000000000000b20         0
  LOAD           0x0000000000000cb0 0xffffffff80200000 0x0000000000200000
                 0x0000000000624000 0x0000000000624000  RWE    0
  LOAD           0x0000000000624cb0 0xffff810000000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000006c4cb0 0xffff810000100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x00000000025c4cb0 0xffff810012000000 0x0000000012000000
                 0x00000000bdf9f700 0x00000000bdf9f700  RWE    0
  LOAD           0x00000000c05643b0 0xffff810100000000 0x0000000100000000
                 0x0000000130000000 0x0000000130000000  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000190 with length 0x00000b20:
Owner Data size Description
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
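
The LOAD segments can be cross-checked against /proc/iomem by computing the last physical byte each one covers. This is a small sketch of that arithmetic using the values from the header without mem=4G and the GART range from the first /proc/iomem listing; the helper names are hypothetical.

```python
# (phys_addr, mem_size) of each LOAD segment, from the header without mem=4G.
LOAD_SEGMENTS = [
    (0x0000000000200000, 0x0000000000624000),
    (0x0000000000000000, 0x00000000000a0000),
    (0x0000000000100000, 0x0000000001f00000),
    (0x0000000012000000, 0x00000000bdf9f700),
    (0x0000000100000000, 0x0000000130000000),
]
GART = (0x20000000, 0x23FFFFFF)  # inclusive range, from /proc/iomem


def last_byte(phys, size):
    """Last physical address covered by a segment (inclusive)."""
    return phys + size - 1


def segments_overlapping(region, segments):
    """Return (first, last) of every segment that overlaps `region`."""
    lo, hi = region
    return [
        (p, last_byte(p, s))
        for p, s in segments
        if p <= hi and last_byte(p, s) >= lo
    ]


for phys, size in LOAD_SEGMENTS:
    print(f"LOAD {phys:#x}-{last_byte(phys, size):#x}")

for first, last in segments_overlapping(GART, LOAD_SEGMENTS):
    print(f"GART overlaps LOAD {first:#x}-{last:#x}")
```

The third segment ends at 0x1ffffff, just below the crash kernel region at 0x2000000, and the fourth ends at 0xcff9f6ff, the end of the System RAM range. The GART range falls inside the fourth segment, which is present in both headers. Whether that is what the platform objects to is not established by this check.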


> You can try putting some printk in /proc/vmcore code and see which
> physical memory area you are accessing when system goes bust. If in all
> the failure cases it is same physical memory area, then we can try to find
> what's so special about it.
> Thanks
> Vivek
>   

The vmcore-incomplete files have different sizes on different runs 
(18M, 32M, ...), and with a network copy we get 190M and 198M.
I tried the patch provided by Bob Montgomery and it seems to work on 
this machine.
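
As a rough complement to the printk approach, the size of a truncated vmcore can be translated back to a physical address through the program headers, since each LOAD segment maps a file offset range onto a physical range. The sketch below makes several assumptions: that vmcore-incomplete is a plain sequential copy of /proc/vmcore, that the header without mem=4G applies, and that the helper name is hypothetical. Because of buffering in the copy path, the file size only approximates the last address actually read, so the result is a hint and not a precise location.

```python
# (file_offset, phys_addr, file_size) of each LOAD segment, copied from the
# readelf output without mem=4G.
LOAD_SEGMENTS = [
    (0x0000000000000cb0, 0x0000000000200000, 0x0000000000624000),
    (0x0000000000624cb0, 0x0000000000000000, 0x00000000000a0000),
    (0x00000000006c4cb0, 0x0000000000100000, 0x0000000001f00000),
    (0x00000000025c4cb0, 0x0000000012000000, 0x00000000bdf9f700),
    (0x00000000c05643b0, 0x0000000100000000, 0x0000000130000000),
]


def offset_to_phys(offset, segments=LOAD_SEGMENTS):
    """Map a vmcore file offset to a physical address, or None if the
    offset falls in the ELF headers/notes or past the last segment."""
    for file_off, phys, size in segments:
        if file_off <= offset < file_off + size:
            return phys + (offset - file_off)
    return None


MIB = 1 << 20
for size_mib in (18, 32, 190, 198):  # sizes reported above, taken as MiB
    phys = offset_to_phys(size_mib * MIB)
    where = "outside LOAD segments" if phys is None else f"phys {phys:#x}"
    print(f"{size_mib}M -> {where}")
```

With the mem=4G header the segment offsets start at 0xc78 instead of 0xcb0 and the last segment is absent, so the table would need adjusting accordingly.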

Thanks,
Chandru


