/proc/vmcore mmap() failure issue
kumagai-atsushi at mxc.nes.nec.co.jp
Thu Nov 21 00:00:09 EST 2013
On 2013/11/21 0:00:01, kexec <kexec-bounces at lists.infradead.org> wrote:
> > > > > Is there any chance that you could look into fixing this. I
> > > > > have no experience writing code for makedumpfile.
> > > >
> > > > I'll send a patch to fix this soon.
> > >
> > > Thanks Atsushi.
> > >
> > > Vivek
> > Vivek, could you test this patch ?
> > Thanks
> > Atsushi Kumagai
> > From: Atsushi Kumagai <kumagai-atsushi at mxc.nes.nec.co.jp>
> > Date: Wed, 20 Nov 2013 10:05:03 +0900
> > Subject: [PATCH] Disable mmap() for reading fractional pages.
> > Since mmap() was introduced on /proc/vmcore, it fails for fractional
> > pages which don't start or end at page boundary due to kernel issue.
> > This patch disables mmap() temporarily for fractional pages to avoid
> > this issue, so mmap() will be used only for aligned pages.
> > Signed-off-by: Atsushi Kumagai <kumagai-atsushi at mxc.nes.nec.co.jp>
> Hi Atsushi,
> Even with this patch applied I see mmap() failure.
> mem_map (39)
> mem_map : ffffea0004e00000
> pfn_start : 138000
> pfn_end : 140000
> read /proc/vmcore with mmap()
> Excluding unnecessary pages : [100.0 %] |STEP [Excluding
> unnecessary pages] : 0.035925 seconds
> Excluding unnecessary pages : [100.0 %] \STEP [Excluding
> unnecessary pages] : 0.035774 seconds
> Excluding unnecessary pages : [100.0 %] -STEP [Excluding
> unnecessary pages] : 0.035229 seconds
> Copying data : [ 40.9 %] -Can't map
> [b98fd000-b9cfd000] with mmap()
> read_from_vmcore: Can't read the dump memory(/proc/vmcore) with mmap().
> readpage_elf: Can't read the dump memory(/proc/vmcore).
> readmem: type_addr: 1, addr:bffba000, size:4096
> read_pfn: Can't get the page data.
> Resource temporarily unavailable
> makedumpfile Failed.
> kdump: saving vmcore failed
> Following is part of /proc/iomem on my system.
> 00100000-bffc283f : System RAM
> 01000000-018c551d : Kernel code
> 018c551e-01ef3f3f : Kernel data
> 0204a000-02984fff : Kernel bss
> 2e000000-35ffffff : Crash kernel
> bffc2840-bfffffff : reserved
> This is a different system from the one I used last time, so I am not
> sure whether this is the same error or something else. One thing is
> clear, though: the last page of System RAM is partial (it ends at
> bffc283f), so we should expect an mmap() failure.
Thanks for testing. I've found my mistake.
My patch tries to disable mmap() when a partial page is found, but
actually mmap() has already been called because update_mmap_range()
calls mmap() for every 4MB region in advance.
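
To illustrate the point, here is a minimal sketch of how such a window
could be chosen. It is not the actual update_mmap_range() code: the
names window_for_offset() and struct map_window are hypothetical, and a
4096-byte page size is assumed. The window starts at the page-aligned
offset of the requested read and covers up to 4MB, so any partial page
that falls inside it is part of the mapping attempt before the
per-page check ever runs.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096ULL        /* assumption: 4KB pages */
#define MMAP_REGION_SIZE  (4ULL << 20)   /* 4MB window, as described above */

/* Hypothetical type: a half-open file-offset range [start, end). */
struct map_window {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick the window that would be mapped for a read at `offset`.
 * The start is rounded down to a page boundary and the end is
 * clamped to `file_end` (the exclusive end of the file).
 */
struct map_window window_for_offset(uint64_t offset, uint64_t file_end)
{
	struct map_window w;

	w.start = offset & ~(PAGE_SIZE_ASSUMED - 1);
	w.end = w.start + MMAP_REGION_SIZE;
	if (w.end > file_end)
		w.end = file_end;
	return w;
}
```

With an offset inside the page at b98fd000, this yields a window of
[b98fd000-b9cfd000], the same shape as the range in the error message
quoted above.
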
If we want to keep using mmap() as much as possible, update_mmap_range()
would have to check whether the target region includes any partial
pages before calling mmap(), but that is too complicated for a workaround.
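
For reference, the check itself could look roughly like the sketch
below. This is only an illustration under simplifying assumptions: the
function name window_has_partial_page() is hypothetical, a 4096-byte
page size is assumed, and both the window and the RAM range are given
in the same address space as half-open ranges. The real code would
have to do this per PT_LOAD segment in file offsets, which is where
the complexity comes from.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096ULL   /* assumption: 4KB pages */

/* Does the page starting at `page` overlap [win_start, win_end)? */
static bool page_overlaps(uint64_t page, uint64_t win_start, uint64_t win_end)
{
	return page < win_end && page + PAGE_SIZE_ASSUMED > win_start;
}

/*
 * Return true if the window [win_start, win_end) touches a page that
 * the RAM range [ram_start, ram_end) only partially covers, i.e. a
 * page containing an unaligned start or end of the range.
 */
bool window_has_partial_page(uint64_t win_start, uint64_t win_end,
			     uint64_t ram_start, uint64_t ram_end)
{
	uint64_t mask = PAGE_SIZE_ASSUMED - 1;

	/* Range starts in the middle of a page. */
	if ((ram_start & mask) &&
	    page_overlaps(ram_start & ~mask, win_start, win_end))
		return true;

	/* Range ends in the middle of a page. */
	if ((ram_end & mask) &&
	    page_overlaps(ram_end & ~mask, win_start, win_end))
		return true;

	return false;
}
```

With the System RAM range from the /proc/iomem output above
(00100000-bffc283f, so an exclusive end of bffc2840), the partial page
is the one starting at bffc2000, and any window that overlaps it would
have to be excluded from mmap().
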
So I think the patch I sent is enough; the policy becomes simply
"Don't use mmap() for buggy kernels".
[PATCH] Fall back to read() when mmap() fails.
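
The idea of that policy can be sketched as follows. This is not the
patch itself, only a minimal illustration: read_with_fallback() is a
hypothetical helper, and it maps just the pages it needs instead of a
4MB window. It tries mmap() first and, if the mapping fails for any
reason, reads the same bytes with pread() instead.

```c
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy `size` bytes at file offset `offset` of `fd` into `buf`.
 * Returns 0 on success and -1 if both mmap() and pread() fail.
 */
int read_with_fallback(int fd, off_t offset, void *buf, size_t size)
{
	long page = sysconf(_SC_PAGESIZE);
	off_t map_start = offset - (offset % page);   /* mmap() needs an aligned offset */
	size_t skip = (size_t)(offset - map_start);
	size_t map_len = skip + size;
	size_t done = 0;
	void *p;

	p = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE, fd, map_start);
	if (p != MAP_FAILED) {
		memcpy(buf, (char *)p + skip, size);
		munmap(p, map_len);
		return 0;
	}

	/* mmap() failed: fall back to plain reads of the same range. */
	while (done < size) {
		ssize_t n = pread(fd, (char *)buf + done, size - done,
				  offset + (off_t)done);
		if (n <= 0)
			return -1;
		done += (size_t)n;
	}
	return 0;
}
```

The caller does not need to know which path was taken, so a kernel
where mmap() on /proc/vmcore fails just ends up on the slower read
path.
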