makedumpfile: get_max_mapnr() from ELF header problem
Atsushi Kumagai
kumagai-atsushi at mxc.nes.nec.co.jp
Mon Mar 24 21:14:21 EDT 2014
>> Now, I think this is a problem of get_mm_sparsemem() in makedumpfile.
>> In more detail, the problem is "miscalculating the address of an
>> unused mem_map".
>>
>> Looking at the log you sent, some addresses of mem_map corresponding
>> to unused pages look invalid like below:
>>
>> mem_map (256)
>> mem_map : 80000c0002018
>> pfn_start : 1000000
>> pfn_end : 1010000
>> mem_map (257)
>> mem_map : 800001840400000
>> pfn_start : 1010000
>> pfn_end : 1020000
>> ...
>> mem_map (544)
>> mem_map : a82400012f14fffc
>> pfn_start : 2200000
>> pfn_end : 2210000
>>
>> ...(and more)
>>
>> However, makedumpfile should calculate such unused mem_map addresses
>> as 0(NOT_MEMMAP_ADDR). Actually it works as expected at least in my
>> environment(x86_64):
>>
>> ...
>> mem_map (16)
>> mem_map : 0
>> pfn_start : 80000
>> pfn_end : 88000
>> mem_map (17)
>> mem_map : 0
>> pfn_start : 88000
>> pfn_end : 90000
>> ...
>>
>> makedumpfile gets the address from mem_section.section_mem_map,
>> which is initialized with zero:
>>
>> [CONFIG_SPARSEMEM_EXTREME]
>> paging_init()
>>   sparse_memory_present_with_active_regions()
>>     memory_present()
>>       sparse_index_init()
>>         sparse_index_alloc() // allocate mem_section with kzalloc()
>>
>> makedumpfile assumes the value of unused mem_section will remain as 0,
>> but I suspect this assumption may be broken in your environment.
>
>No, I think your assumption is true also for my environment. For my
>dump the "mem_section" array is zero except for the first entry.
>
>crash> print/x mem_section
>$1 = {0x2fe6f800, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}
>
>But it looks like get_mm_sparsemem() does not check for zero.
>The nr_to_section() function just returns an invalid address
>(something between 0 and 4096) for section in case we get zero
>from the "mem_section" entry. This address is then used for
>calculating "mem_map":
>
>mem_map = section_mem_map_addr(section);
>mem_map = sparse_decode_mem_map(mem_map, section_nr);

In other architectures, the check by is_kvaddr() prevents reading
an invalid address, but it doesn't do anything in the case of s390
due to its memory management mechanism:

  s390x: Fix KVBASE to correct value for s390x architecture.
  http://lists.infradead.org/pipermail/kexec/2011-March/004930.html

Finally I've understood the cause of this issue completely,
thanks for your report.

>With the patch below I could use makedumpfile (1.5.3) successfully
>on the 1TB dump with mem=1G. I attached the -D output that is
>created by makedumpfile with the patch.
>
>But compared to my first patch it takes much longer and the resulting
>dump is bigger (version 1.5.3):
>
>             | Dump time  | Dump size
>-------------+------------+-----------
>First patch  | 10 sec     | 124 MB
>Second patch | 87 minutes | 6348 MB
>
>No idea why the dump is bigger with the second patch. I think the time
>is consumed in write_kdump_pages_cyclic() by checking for zero pages
>for the whole range:

I suppose this difference was resolved with v2 of the second patch,
right?

>
>5970 for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>(gdb) n
>5972 if ((num_dumped % per) == 0)
>(gdb) n
>5978 if (!is_dumpable_cyclic(info->partial_bitmap2, pfn))
>(gdb) n
>5981 num_dumped++;
>(gdb) n
>5983 if (!read_pfn(pfn, buf))
>(gdb) n
>5989 if ((info->dump_level & DL_EXCLUDE_ZERO)
>(gdb) n
>5990 && is_zero_page(buf, info->page_size)) {
>(gdb) n
>5991 if (!write_cache(cd_header, pd_zero, sizeof(page_desc_t)))
>(gdb) n
>5993 pfn_zero++;
>(gdb) n
>5994 continue;
>
>(gdb) print end_pfn
>$3 = 268435456
>
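As a quick sanity check on the end_pfn value above, the following sketch converts an address range into a page frame count. The 4096-byte page size is an assumption (it is not stated explicitly in this thread), and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size in bytes */

/* Number of page frames covering 'bytes' of address space.
 * Only whole pages are accepted. */
static uint64_t bytes_to_pfns(uint64_t bytes)
{
    assert(bytes % SKETCH_PAGE_SIZE == 0);
    return bytes / SKETCH_PAGE_SIZE;
}
```

Under that assumption, bytes_to_pfns(1ULL << 40) gives 268435456, the end_pfn printed in gdb, while bytes_to_pfns(1ULL << 30) gives 262144. So with mem=1G the loop would visit about 1024 times more pfns than the kernel actually used.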
>So the first patch would be better for my scenario. What in particular are your
>concerns with that patch?

I think the v2 second patch is a reasonable fix for the bug in
get_mm_sparsemem().
Additionally, the latest patch you posted to adjust max_mapnr
(which uses mem_map_data[]) is acceptable in place of the first
patch.

So could you re-post the two as a formal patch set?
I mean that patch descriptions and your signature are needed.

Thanks
Atsushi Kumagai

>Michael
>
>The following patch adds the zero check for "mem_section" entries
>---
> makedumpfile.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -2402,11 +2402,14 @@ nr_to_section(unsigned long nr, unsigned
> {
> 	unsigned long addr;
>
>-	if (is_sparsemem_extreme())
>+	if (is_sparsemem_extreme()) {
>+		if (mem_sec[SECTION_NR_TO_ROOT(nr)] == 0)
>+			return NOT_KV_ADDR;
> 		addr = mem_sec[SECTION_NR_TO_ROOT(nr)] +
> 		    (nr & SECTION_ROOT_MASK()) * SIZE(mem_section);
>-	else
>+	} else {
> 		addr = SYMBOL(mem_section) + (nr * SIZE(mem_section));
>+	}
>
> 	if (!is_kvaddr(addr))
> 		return NOT_KV_ADDR;
>@@ -2490,10 +2493,14 @@ get_mm_sparsemem(void)
> 	}
> 	for (section_nr = 0; section_nr < num_section; section_nr++) {
> 		section = nr_to_section(section_nr, mem_sec);
>-		mem_map = section_mem_map_addr(section);
>-		mem_map = sparse_decode_mem_map(mem_map, section_nr);
>-		if (!is_kvaddr(mem_map))
>+		if (section == NOT_KV_ADDR) {
> 			mem_map = NOT_MEMMAP_ADDR;
>+		} else {
>+			mem_map = section_mem_map_addr(section);
>+			mem_map = sparse_decode_mem_map(mem_map, section_nr);
>+			if (!is_kvaddr(mem_map))
>+				mem_map = NOT_MEMMAP_ADDR;
>+		}
> 		pfn_start = section_nr * PAGES_PER_SECTION();
> 		pfn_end = pfn_start + PAGES_PER_SECTION();
> 		if (info->max_mapnr < pfn_end)
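The pfn_start/pfn_end computation in the last hunk can be checked against the mem_map logs quoted at the top of this mail. The sketch below is only an illustration: the struct and function names are hypothetical, and the pages-per-section values used in the note after it (0x10000 for the s390x dump, 0x8000 for the x86_64 one) are inferred from the logs on the assumption that they print hexadecimal numbers.

```c
#include <assert.h>
#include <stdint.h>

struct pfn_range {
    uint64_t start; /* first pfn covered by the section */
    uint64_t end;   /* one past the last pfn covered */
};

/* pages_per_section stands in for PAGES_PER_SECTION(); must be non-zero. */
static struct pfn_range section_pfn_range(uint64_t section_nr,
                                          uint64_t pages_per_section)
{
    struct pfn_range r;

    assert(pages_per_section != 0);
    r.start = section_nr * pages_per_section;
    r.end = r.start + pages_per_section;
    return r;
}
```

For example, section_pfn_range(256, 0x10000) yields 0x1000000 to 0x1010000, and section_pfn_range(16, 0x8000) yields 0x80000 to 0x88000, the same ranges as in the two logs. The pfn range depends only on the section number, so a section with no valid mem_map still extends pfn_end.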