[PATCH] vmcore: call remap_pfn_range() separately for respective partial pages

HATAYAMA Daisuke d.hatayama at jp.fujitsu.com
Wed Dec 4 04:05:52 EST 2013


(2013/12/04 0:12), Vivek Goyal wrote:
> On Tue, Dec 03, 2013 at 02:16:35PM +0900, HATAYAMA Daisuke wrote:
>
> [..]
>>> Even if we copy partial pages into the 2nd kernel, we need to use ioremap()
>>> once on them, and I think that ioremap() is exactly similar to
>>> remap_pfn_range() for a single page. There seems to be no difference in safety
>>> between them.
>>>
>>
>> I suspected some kind of pre-fetching could be performed once the page table
>> is created. But that is common to both of the cases above. Then, as you say,
>> it would be safer to read less data from the non-System-RAM area. Copying seems
>> better in our case.
>
> If we map the page and read *only* those bytes as advertised by the ELF
> headers and fill the rest of the bytes with zeros, I think that seems like
> the right thing to do. We are only accessing what has been exported by the
> ELF headers and not trying to read outside that range.
>
>>
>> Another concern about reading data from partial pages is the possibility of
>> undesirable hardware prefetch into the non-System-RAM area. Is it better to disable this?
>
> If it becomes an issue, maybe. I think we can fix it when we run into
> an actual issue.
>

I see.
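
By the way, to make the copying concrete for myself, I have something like
the sketch below in mind for each partial page: map it once, copy only the
bytes advertised by the ELF headers into a zero-filled page in the 2nd
kernel, and unmap it again. The helper name and the details are just
placeholders, not the actual patch:

#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/io.h>
#include <linux/mm.h>

/*
 * Sketch only: copy the ELF-advertised bytes of a partial page into a
 * zero-filled page of the 2nd kernel so that later read()/mmap() never
 * touch the non-System-RAM part of that page.
 */
static int __init copy_partial_page(unsigned long long paddr, size_t len,
				    void **bufp)
{
	size_t offset = offset_in_page(paddr);	/* start within the page */
	void __iomem *ioaddr;
	void *buf;

	if (offset + len > PAGE_SIZE)
		return -EINVAL;

	buf = (void *)get_zeroed_page(GFP_KERNEL);	/* hole stays zero */
	if (!buf)
		return -ENOMEM;

	ioaddr = ioremap_cache(paddr - offset, PAGE_SIZE);
	if (!ioaddr) {
		free_page((unsigned long)buf);
		return -EIO;
	}

	/* read only the bytes advertised by the ELF headers */
	memcpy_fromio(buf + offset, ioaddr + offset, len);
	iounmap(ioaddr);

	*bufp = buf;
	return 0;
}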

>>
>>> Also, the current /proc/vmcore shows user-land tools a shape with holes not
>>> filled with zeros, both in the case of read() and in the case of mmap(). If we
>>> adopt the copying approach without reading data in the holes, the shape of
>>> /proc/vmcore changes again.
>>>
>>
>> So, the next patch will change the data in the holes by filling them with zeros.
>>
>> BTW, we now have the page cache interface implemented by Michael Holzheu, but
>> we have yet to use it on x86 because we've never needed it so far. It would be
>> natural to use it to read partial pages on demand, but I also partly think
>> that it's not the proper time to start using a new mechanism that needs to be
>> tested more. What do you think?
>
> Do we gain anything significant by using that interface? To me it looks
> like this will just delay creation of the mapping for partial pages. It does
> not save us any memory in the second kernel, does it?
>

The amount of memory for partial pages seems not so problematic.
No, it does not save us much memory: the number of partial pages is at most
twice the number of System RAM entries, and I guess the number of entries
doesn't exceed 100 even in the worst case. So it stays below 1MiB even in
the worst case.
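
To spell out the worst case I have in mind (the numbers are only my rough
estimate, not measured):

  100 System RAM entries * 2 partial pages per entry * 4KiB per page = 800KiB < 1MiB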

The mechanism would be useful for platforms with a large note segment.
But the per-cpu note segment on x86 is relatively small, so the platforms that
would benefit are the ones with a very large number of cpus, and those are still
rare. Other architectures with a large per-cpu note segment would already want
the feature, just as Michael Holzheu explained previously for some of their s390
platforms.

> I would think that in the first pass, we keep it simple. Copy partial pages
> into the second kernel's memory. Read data as exported by the ELF headers.
> Fill the rest of the page with zeros. Adjust the /proc/vmcore ELF headers
> accordingly and that should do it.
>

Yes, I also want to keep it simple. I'll post a new patch tomorrow.
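
Just to sketch the direction I have in mind for serving such a copied
partial page (again, placeholder names, not the patch itself): read()
simply copies from the buffer, and mmap() maps the page backing the
buffer with remap_pfn_range(), so user space only ever sees the
ELF-advertised bytes plus zeros:

#include <linux/mm.h>
#include <linux/uaccess.h>

/*
 * Sketch only: "buf" is the page-sized, zero-padded copy of a partial
 * page made at init time; offset and count are assumed to be clamped
 * to that page by the caller.
 */
static ssize_t read_partial_page(void *buf, char __user *uaddr,
				 size_t count, loff_t offset)
{
	/* read(): copy from the 2nd kernel buffer, never from old memory */
	if (copy_to_user(uaddr, buf + offset, count))
		return -EFAULT;
	return count;
}

static int mmap_partial_page(struct vm_area_struct *vma,
			     unsigned long vma_offset, void *buf)
{
	/* mmap(): map the buffer's own page instead of the old memory */
	unsigned long pfn = page_to_pfn(virt_to_page(buf));

	return remap_pfn_range(vma, vma->vm_start + vma_offset, pfn,
			       PAGE_SIZE, vma->vm_page_prot);
}

Since the buffer is ordinary kernel memory allocated with
get_zeroed_page(), remapping it this way should be safe regardless of
what surrounds the partial page in the old memory.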

-- 
Thanks.
HATAYAMA, Daisuke