[PATCH] kexec: fix 64Gb limit on x86 w/ PAE

Fri Apr 9 07:05:45 EDT 2010

On Fri, Apr 09, 2010 at 11:41:47AM +1000, Simon Horman wrote:
> On Thu, Apr 08, 2010 at 09:24:39PM -0400, Neil Horman wrote:
> > On Fri, Apr 09, 2010 at 08:32:48AM +1000, Simon Horman wrote:
> > > On Thu, Apr 08, 2010 at 12:46:44PM -0400, Neil Horman wrote:
> > > > Fix up x86 kexec to exclude memory on i686 kernels beyond the 64GB limit
> > > > 
> > > > We found a problem recently on x86 systems.  If a 32-bit PAE-enabled system
> > > > contains more than 64GB of physical RAM, the kernel will truncate the max_pfn
> > > > value to 64GB.  Unfortunately it still leaves all the physical memory regions
> > > > present in /proc/iomem.  Since kexec builds its ELF headers based on
> > > > /proc/iomem, the ELF headers indicate the size of memory is larger than what the
> > > > kernel is willing to address.  The result is that, during a copy of
> > > > /proc/vmcore, a read will return -EFAULT when the requested offset is beyond the
> > > > 64GB range, leaving the seemingly truncated vmcore useless, as the ELF headers
> > > > indicate memory beyond what the file contains.
> > > > 
> > > > The fix for it is pretty straightforward: just ensure that, on x86 systems,
> > > > we don't record any entries in the memory_range array that cross the 64GB mark.
> > > > This keeps us in line with the kernel and lets the copy finish successfully,
> > > > providing a workable core.
> > > 
> > > Hi Neil,
> > > 
> > > This seems reasonable to me.
> > > 
> > > > Tested successfully by myself
> > > > Originally-authored-by: Dave Anderson <anderson at redhat.com>
> > > > Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
> > > > 
> > > > diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
> > > > index 9d37442..85879a9 100644
> > > > --- a/kexec/arch/i386/crashdump-x86.c
> > > > +++ b/kexec/arch/i386/crashdump-x86.c
> > > > @@ -114,6 +114,15 @@ static int get_crash_memory_ranges(struct memory_range **range, int *ranges,
> > > >  		if (end <= 0x0009ffff)
> > > >  			continue;
> > > >  
> > > > +		/*
> > > > +		 *  Exclude any segments starting at or beyond 64GB, and
> > > > +		 *  restrict any segments from ending at or beyond 64GB.
> > > > +		 */
> > > > +		if (start >= 0x1000000000)
> > > > +			continue;
> > > > +		if (end >= 0x1000000000)
> > > > +			end = 0xfffffffff;
> > > > +
> > > 
> > > Nit picking...
> > > 
> > > Might it be better to use 0xfffffffff (or 0x1000000000) consistently?
> > > 
> > > 		if (start > 0xfffffffff)
> > > 			continue;
> > > 		if (end > 0xfffffffff)
> > > 			end = 0xfffffffff;
> > > 
> > Not sure what you mean by consistent here?  It seems we are using it
> > consistently in this patch.  Or are you referring to updating the function as a
> > whole?
> 
> Sorry, yes they are consistent. And I believe the code you posted is correct.
> 
> What I meant was that, since 0xfffffffff + 1 = 0x1000000000,
> the code could use only 0xfffffffff or only 0x1000000000,
> which seems to make things slightly more obvious when reading the code.
> 
Ah, ok.  yeah, I'm fine with that.  I'll bump the value, change the comparison
to >=, macrotize the constant and repost.  Thanks!
Neil
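
For illustration only, the reworked check might look roughly like the sketch
below.  It is not the reposted patch: the macro name X86_PAE_MAX_ADDR and the
helper clamp_range_to_pae_limit() are hypothetical names chosen here, and the
helper stands in for the two if-statements inside the loop of
get_crash_memory_ranges().  It assumes, as the original hunk does, that "end"
is an inclusive address as read from /proc/iomem.

```c
/*
 * Hypothetical macro name; the thread only says the constant will be
 * turned into a macro.  64GB == 1ULL << 36 == 0x1000000000.
 */
#define X86_PAE_MAX_ADDR 0x1000000000ULL

/*
 * Hypothetical helper, for illustration.
 *
 * Returns 0 if the range starts at or beyond the 64GB limit and should be
 * skipped entirely (the "continue" in the original hunk).  Otherwise
 * returns 1 and, if needed, clamps the inclusive end address to the last
 * byte below the limit.
 */
static int clamp_range_to_pae_limit(unsigned long long start,
				    unsigned long long *end)
{
	if (start >= X86_PAE_MAX_ADDR)
		return 0;
	if (*end >= X86_PAE_MAX_ADDR)
		*end = X86_PAE_MAX_ADDR - 1;
	return 1;
}
```

Using a single constant for both comparisons, and deriving the clamped end
from it, keeps the two checks visibly tied to the same limit.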

> > > Or even make 0xfffffffff (or 0x1000000000) a #define ?
> > Yeah, that makes sense.  If you can clarify your above point on consistency, I
> > can repost.
> > 
> > thanks
> > Neil
> > 
> > > 
> > > >  		crash_memory_range[memory_ranges].start = start;
> > > >  		crash_memory_range[memory_ranges].end = end;
> > > >  		crash_memory_range[memory_ranges].type = type;
> > > > 
> > > > 
> > > > _______________________________________________
> > > > kexec mailing list
> > > > kexec at lists.infradead.org
> > > > http://lists.infradead.org/mailman/listinfo/kexec
> > > 
> 