[PATCH][ia64] Fix the difference between node_mem_map and node_start_pfn. (Re: makedumpfile fails on SGI machine)
Jay Lan
jlan at sgi.com
Fri Aug 29 16:12:03 EDT 2008
Jay Lan wrote:
> Ken'ichi Ohmichi wrote:
>> Hi Bernhard, Jay,
>>
>> Bernhard Walle wrote:
>>> Hi Ken'ichi Ohmichi,
>>>
>>> * Jay Lan [2008-08-27 18:43]:
>>>> Thanks for your patch!
>>>>
>>>> I am wondering whether the discontigmem kernel has a legitimate bug.
>>>> If so, we should probably report it.
>>>>
>>>> I tested your patch on a machine that used to fail in executing
>>>> 'makedumpfile'. It now generated a dump file fine.
>>> thanks for the patch, I can also report that with the patch and with
>>> vmlinux it works now.
>> Thank you for the report. That's good news :-)
>>
>>
>>> However, shouldn't we add that vmem_map to VMCOREINFO of the kernel?
>
> Hmm, it failed on a shub2 machine. (The previous one that worked was a
> shub1 machine, an A3700. An A4700 is a shub2.) I assume the warning
> about kernel version 2.6.27 not being supported is harmless?
>
> a4700rac:~ # /bin/makedumpfile-1.2.7-0.2 -c -d31 -x
> /boot/vmlinux-2.6.27-rc4-vanilla /proc/vmcore /diskdump/dumpfile
> Can't distinguish the pgtable.
> The kernel version is not supported.
> The created dumpfile may be incomplete.
> Excluding unnecessary pages : [ 0 %] readmem: Can't convert a
> virtual address(a07ffff9df8f5800) to offset.
> create_2nd_bitmap: Can't exclude unnecessary pages.
>
> makedumpfile Failed.
> a4700rac:~ #
Sorry, that output was from makedumpfile-1.2.7-0.2; I should have used 1.2.8.
Here is another test, using a 2.6.26 kernel with your kernel patch
and makedumpfile 1.2.8.
a4700rac:~ # /bin/makedumpfile-1.2.8 -c -d31 -D /proc/vmcore
/diskdump/dumpfile
LOAD (0)
phys_start : 6044000000
phys_end : 60447bd920
virt_start : a000000100000000
virt_end : a0000001007bd920
[bunch of other LOAD messages]
Linux kdump
max_mapnr : 2069efeb
num of NODEs : 33
Memory type : DISCONTIGMEM
mem_map (0)
mem_map : 0
pfn_start : 0
pfn_end : 600300
[bunch of other mem_map messages]
Copying data : [ 0 %]
Oops, the kdump kernel MCA'ed at this point! The system was rebooted.
But at least makedumpfile understood /proc/vmcore.
I think the MCA while copying data is a different issue. It MCA'ed even on
'cp /proc/vmcore /diskdump/vmcore' :(
- jay
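For reference, a minimal sketch of how a virtual address can be matched
against the LOAD entries printed by -D. It is illustrative only: the struct,
the table and the helper name vaddr_to_paddr() are hypothetical, only
LOAD (0) from the output above is filled in (the other segments are elided
there), and the failing address comes from the earlier 1.2.7 run on a
different kernel. It is not makedumpfile's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One LOAD entry; field names follow the "makedumpfile -D" output. */
struct load_segment {
    uint64_t phys_start, phys_end;
    uint64_t virt_start, virt_end;
};

/* Only LOAD (0) from the output above is known; the rest are elided. */
static const struct load_segment load_segments[] = {
    { 0x6044000000ULL, 0x60447bd920ULL,
      0xa000000100000000ULL, 0xa0000001007bd920ULL },
};

/*
 * Hypothetical helper: translate vaddr to a physical address if some
 * LOAD segment covers it. Returns false when no segment matches, which
 * is the situation a "Can't convert a virtual address" message describes.
 */
static bool vaddr_to_paddr(uint64_t vaddr, uint64_t *paddr)
{
    size_t n = sizeof(load_segments) / sizeof(load_segments[0]);

    for (size_t i = 0; i < n; i++) {
        const struct load_segment *s = &load_segments[i];

        if (vaddr >= s->virt_start && vaddr < s->virt_end) {
            *paddr = s->phys_start + (vaddr - s->virt_start);
            return true;
        }
    }
    return false;
}
```

With this table, an address inside LOAD (0) such as a000000100001000
translates to 6044001000, while a07ffff9df8f5800 matches no entry.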
>
> I will try your kernel patch next.
>
> Regards,
> - jay
>
>> I think that we would rather fix the kernel bug than add vmem_map to
>> VMCOREINFO of the kernel. If fixing it, makedumpfile does not need
>> vmem_map.
>>
>> The attached patch fixes the kernel bug, and makedumpfile can work without
>> the '-x' option. I tested it on my ia64 non-NUMA machine, and it works fine.
>> Could you test the attached patch on your machine again?
>>
>>
>> Thanks
>> Ken'ichi Ohmichi
>>
>> ---
>> [PATCH][ia64] Fix the difference between node_mem_map and node_start_pfn.
>>
>> makedumpfile[1] cannot run on an ia64 discontigmem kernel, because the member
>> node_mem_map of struct pglist_data has an invalid value. This patch fixes it.
>>
>> node_start_pfn shows the start pfn of each node, and node_mem_map should
>> point to the 'struct page' of each node's node_start_pfn.
>> On my machine, node0's node_start_pfn is 0x400 and its node_mem_map points to
>> 0xa0007fffbf000000. This address is the same as vmem_map, so node_mem_map
>> points to the 'struct page' of pfn 0, even though node_start_pfn is 0x400.
>>
>> The cause is the rounding down of min_pfn in count_node_pages().
>> This patch removes the ORDERROUNDDOWN() step.
>>
>>
>> makedumpfile[1]: dump filtering command
>> https://sourceforge.net/projects/makedumpfile/
>>
>> Signed-off-by: Ken'ichi Ohmichi <oomichi at mxs.nes.nec.co.jp>
>> ---
>> --- a/arch/ia64/mm/discontig.c 2008-08-29 23:05:52.000000000 +0900
>> +++ b/arch/ia64/mm/discontig.c 2008-08-29 23:06:59.000000000 +0900
>> @@ -631,7 +631,6 @@ static __init int count_node_pages(unsig
>>  			(min(end, __pa(MAX_DMA_ADDRESS)) - start) >>PAGE_SHIFT;
>>  #endif
>>  	start = GRANULEROUNDDOWN(start);
>> -	start = ORDERROUNDDOWN(start);
>>  	end = GRANULEROUNDUP(end);
>>  	mem_data[node].max_pfn = max(mem_data[node].max_pfn,
>>  				     end >> PAGE_SHIFT);
>> --- a/include/asm-ia64/meminit.h 2008-08-29 23:06:36.000000000 +0900
>> +++ b/include/asm-ia64/meminit.h 2008-08-29 23:06:48.000000000 +0900
>> @@ -47,7 +47,6 @@ extern int reserve_elfcorehdr(unsigned l
>> */
>> #define GRANULEROUNDDOWN(n) ((n) & ~(IA64_GRANULE_SIZE-1))
>> #define GRANULEROUNDUP(n) (((n)+IA64_GRANULE_SIZE-1) & ~(IA64_GRANULE_SIZE-1))
>> -#define ORDERROUNDDOWN(n) ((n) & ~((PAGE_SIZE<<MAX_ORDER)-1))
>>
>> #ifdef CONFIG_NUMA
>> extern void call_pernode_memory (unsigned long start, unsigned long len, void *func);
>> _
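
To illustrate the rounding described in the patch description, here is a
small sketch of the two macros applied to a node that starts at pfn 0x400.
The constants are assumptions chosen for illustration (16KB pages, a 16MB
granule, MAX_ORDER of 17) and are not taken from this thread; the helper
names min_pfn_before_patch() and min_pfn_after_patch() are hypothetical.
The macro bodies are the ones shown in the diff.

```c
#include <stdint.h>

/* Assumed values for illustration only; real ones depend on the config. */
#define PAGE_SHIFT        14
#define PAGE_SIZE         (1ULL << PAGE_SHIFT)   /* 16KB pages */
#define MAX_ORDER         17
#define IA64_GRANULE_SIZE (1ULL << 24)           /* 16MB granule */

/* Macro bodies as they appear in include/asm-ia64/meminit.h in the diff. */
#define GRANULEROUNDDOWN(n) ((n) & ~(IA64_GRANULE_SIZE-1))
#define ORDERROUNDDOWN(n)   ((n) & ~((PAGE_SIZE<<MAX_ORDER)-1))

/* Hypothetical helper: start pfn with both round-downs applied. */
static uint64_t min_pfn_before_patch(uint64_t start)
{
    start = GRANULEROUNDDOWN(start);
    start = ORDERROUNDDOWN(start);   /* the line the patch removes */
    return start >> PAGE_SHIFT;
}

/* Hypothetical helper: start pfn with only the granule round-down. */
static uint64_t min_pfn_after_patch(uint64_t start)
{
    start = GRANULEROUNDDOWN(start);
    return start >> PAGE_SHIFT;
}
```

Under these assumptions, a start address of 0x400 << PAGE_SHIFT gives pfn 0
with both round-downs and pfn 0x400 with only the granule round-down, which
matches the mismatch between node_mem_map and node_start_pfn reported above.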