[PATCH v3 1/2] Generic handling of multi-page exclusions
kumagai-atsushi at mxc.nes.nec.co.jp
Fri May 16 00:24:02 PDT 2014
>On Wed, 14 May 2014 19:54:28 +0900 (JST)
>HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com> wrote:
>> From: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
>> Subject: Re: [PATCH v3 1/2] Generic handling of multi-page exclusions
>> Date: Wed, 14 May 2014 19:37:23 +0900
>> > From: Atsushi Kumagai <kumagai-atsushi at mxc.nes.nec.co.jp>
>> > Subject: RE: [PATCH v3 1/2] Generic handling of multi-page exclusions
>> > Date: Wed, 14 May 2014 07:54:17 +0000
>> >> Hello Petr,
>> >>>When multiple pages are excluded from the dump, store the extents in
>> >>>struct cycle and check if anything is still pending on the next invocation
>> >>>of __exclude_unnecessary_pages. This assumes that:
>> >>> 1. after __exclude_unnecessary_pages is called for a struct mem_map_data
>> >>> that extends beyond the current cycle, it is not called again during
>> >>> that cycle,
>> >>> 2. in the next cycle, __exclude_unnecessary_pages is not called before
>> >>> this final struct mem_map_data.
>> >>>Both assumptions are met if struct mem_map_data segments:
>> >>> 1. do not overlap,
>> >>> 2. are sorted by physical address in ascending order.
>> >> In the ELF case, write_elf_pages_cyclic() processes PT_LOAD entries starting
>> >> from PT_LOAD(0), which can unluckily break both assumptions.
>> >> Actually this patch doesn't work on my machine:
>> >> LOAD (0)
>> >> phys_start : 1000000
>> >> phys_end : 182f000
>> >> virt_start : ffffffff81000000
>> >> virt_end : ffffffff8182f000
>> >> LOAD (1)
>> >> phys_start : 1000
>> >> phys_end : 9b400
>> >> virt_start : ffff810000001000
>> >> virt_end : ffff81000009b400
>> >> LOAD (2)
>> >> phys_start : 100000
>> >> phys_end : 27000000
>> >> virt_start : ffff810000100000
>> >> virt_end : ffff810027000000
>> >> LOAD (3)
>> >> phys_start : 37000000
>> >> phys_end : cff70000
>> >> virt_start : ffff810037000000
>> >> virt_end : ffff8100cff70000
>> >> LOAD (4)
>> >> phys_start : 100000000
>> >> phys_end : 170000000
>> >> virt_start : ffff810100000000
>> >> virt_end : ffff810170000000
>> >> PT_LOAD(2) includes PT_LOAD(0), and their physical addresses aren't sorted.
>> >> If the "sort issue" were the only problem, it might be easy to fix with a new
>> >> iterator like "for_each_pt_load()" that iterates PT_LOAD entries in ascending
>> >> order by physical address.
>> >> However, I don't have a good idea for solving the overlap issue now...
>> > Is it enough to merge them? Prepare a modified version of the PT_LOAD list
>> > and refer to it in actual processing. I think this also leads to
>> > cleaning up readpage_elf(), which addresses some overlapping memory map
>> > issue on ia64.
>> I'm saying this because I don't find anywhere virt_start or virt_end
>> is used. We look up page table to convert virtual address to physical
>> address, not PT_LOAD entries.
At first I thought it would be better to keep the original PT_LOAD list, but the
current code can already split it. So I think we don't need to worry about
modifying PT_LOAD entries now.
If crash doesn't use virt_start and virt_end either, your idea sounds good to me.
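The merge discussed here could look roughly like the sketch below. It is only an illustration: struct phys_range and merge_phys_ranges() are hypothetical names, not existing makedumpfile code, phys_end is assumed to be exclusive, and only the physical ranges are kept (virt_start and virt_end are dropped).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not a makedumpfile structure. */
struct phys_range {
	unsigned long long start;	/* phys_start */
	unsigned long long end;		/* phys_end, treated as exclusive */
};

/* qsort() comparator: ascending by start address. */
static int cmp_phys_start(const void *a, const void *b)
{
	const struct phys_range *x = a;
	const struct phys_range *y = b;

	if (x->start < y->start)
		return -1;
	return x->start > y->start;
}

/*
 * Sort the ranges in place and fold every range that overlaps (or touches)
 * its predecessor into it. Returns the number of ranges left.
 */
static size_t merge_phys_ranges(struct phys_range *r, size_t n)
{
	size_t i, last = 0;

	assert(r != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), cmp_phys_start);
	for (i = 1; i < n; i++) {
		if (r[i].start <= r[last].end) {
			/* Overlap: extend the previous range if needed. */
			if (r[i].end > r[last].end)
				r[last].end = r[i].end;
		} else {
			/* Gap: start a new merged range. */
			r[++last] = r[i];
		}
	}
	return last + 1;
}
```

Fed the five PT_LOAD entries listed above, this leaves four ranges, because LOAD (0) lies entirely inside LOAD (2).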
>Oh, you're right! Why does the ordering of PT_LOAD segments matter here?
>If makedumpfile fails on your machine after applying my patches, then
>it's quite likely because of something else.
Hatayama-san must have meant that a VtoP mapping included in a PT_LOAD will be
lost by merging PT_LOADs, and that this looks like no problem for makedumpfile in practice.
What I meant is that the ELF path calls for_each_cycle(PT_LOAD[i].pfn_start, PT_LOAD[i].pfn_end)
starting from i=0, so the PFNs aren't sorted and they overlap in some cases
like mine. OTOH, the kdump path always calls for_each_cycle(0, info->max_mapnr),
so there is no problem there.
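To make the difference between the two paths concrete, here is a small check for the "sorted and not overlapping" property. This is a sketch only: struct pfn_range and first_bad_range() are hypothetical helpers, not makedumpfile code, and the end PFN is assumed to be exclusive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration: a PFN range with an exclusive end. */
struct pfn_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Return the index of the first range that starts before the previous one
 * ends (i.e. it is out of order or overlapping), or n if every range is
 * sorted and disjoint.
 */
static size_t first_bad_range(const struct pfn_range *r, size_t n)
{
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 1; i < n; i++)
		if (r[i].start < r[i - 1].end)
			return i;
	return n;
}
```

A single range such as the kdump path's [0, max_mapnr) passes trivially, while a list in PT_LOAD index order like the one on my machine already fails at index 1.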
Below I explain why the problem I hit happens; each numbered paragraph is one step of the flow:
1. PT_LOAD(0): pfn [0x00001000 - 0x0000182f]
There are free pages [0x1820-0x1840], so exclude_range(0x1820, 0x1840)
was called. As a result, exclude_pfn_start was set to 0x182f and exclude_pfn_end
was set to 0x1840. (I'll write this as "exclude_pfn [0x182f-0x1840]".)
2. PT_LOAD(1): pfn [0x00000001 - 0x0000009c]
At the top of __exclude_unnecessary_pages(), exclude_range(0x182f, 0x1840)
was called since exclude_pfn was [0x182f-0x1840]. This exclude_range() didn't
perform any bitmap operations because the PFNs are outside the cycle. However,
exclude_pfn was set to [0x9c-0x1840] in the function.
3. PT_LOAD(2): pfn [0x00000100 - 0x00027000]
Here, exclude_pfn was [0x9c-0x1840], so exclude_range(0x9c, 0x1840) was called.
Unluckily, some of these PFNs are in the cycle, so they are wrongly excluded.
Only the PFNs [0x182f-0x1840] should have been excluded; this is the problem.
4. PT_LOAD(3): pfn [0x00037000 - 0x000cff70]
There is no problem here.
5. PT_LOAD(4): pfn [0x00100000 - 0x00170000]
There is no problem here either.
In this way, the unsorted order causes exclude_pfn_(start|end) to be set
wrongly, and the combination of overlapping segments and a wrong
exclude_pfn_(start|end) causes wrong bitmap operations.
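The flow above can be replayed with a small model. This is a simplified sketch based only on the behaviour described in this mail, not the patch itself: struct cycle_model and both functions are hypothetical, each PT_LOAD is assumed to fit in one cycle, range ends are treated as exclusive, and bitmap clearing is replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the cycle state described in this mail. */
struct cycle_model {
	unsigned long long start_pfn, end_pfn;	/* current cycle, end exclusive */
	unsigned long long exclude_pfn_start;	/* pending exclusion carried over */
	unsigned long long exclude_pfn_end;
};

/*
 * Model of exclude_range(): remember the part beyond the cycle as pending,
 * then "clear" every PFN of the request that lies inside the cycle.
 * Returns how many PFNs were cleared.
 */
static unsigned long long
exclude_range_model(struct cycle_model *c, unsigned long long pfn,
		    unsigned long long endpfn)
{
	unsigned long long cleared = 0;

	assert(c != NULL);
	c->exclude_pfn_start = c->end_pfn;
	c->exclude_pfn_end = endpfn;
	if (c->end_pfn < endpfn)
		endpfn = c->end_pfn;

	for (; pfn < endpfn; pfn++)
		if (pfn >= c->start_pfn)	/* PFNs below the cycle are skipped */
			cleared++;
	return cleared;
}

/*
 * Model of the check at the top of __exclude_unnecessary_pages(): switch to
 * a new cycle and flush whatever is still pending from the previous one.
 */
static unsigned long long
enter_cycle_model(struct cycle_model *c, unsigned long long start,
		  unsigned long long end)
{
	assert(c != NULL);
	c->start_pfn = start;
	c->end_pfn = end;
	if (c->exclude_pfn_start < c->exclude_pfn_end)
		return exclude_range_model(c, c->exclude_pfn_start,
					   c->exclude_pfn_end);
	return 0;
}
```

Calling enter_cycle_model() for each PT_LOAD in index order, plus exclude_range_model(&c, 0x1820, 0x1840) in step 1, the call for PT_LOAD(2) clears 0x1740 PFNs in this model, although only the 0x11 PFNs of [0x182f-0x1840] were still pending.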
>FWIW I verified on a few dumpfiles that makedumpfile produced exactly
>the same output before and after applying the patches.
>OTOH I can see a warning when writing an ELF file. Before the patch:
>Excluding unnecessary pages : [100.0 %] \WARNING: PFN out of cycle range. (pfn:c00, cycle:[3fc00-3ffd0])
>After the patch:
>Excluding unnecessary pages : [100.0 %] \WARNING: PFN out of cycle range. (pfn:26c00, cycle:[0-1ff6])
>I'm unsure why there are out-of-cycle PFNs. Researching...