makedumpfile saving vmcore fails with dynamically allocated mem_section (was: Re: [PATCH] handle renamed init_level4_pgt -> init_top_pgt)

Dave Young dyoung at redhat.com
Mon Jan 8 00:40:16 PST 2018


On 01/06/18 at 11:06am, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
> > On 01/05/18 at 09:16am, Dave Anderson wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > On 01/03/18 at 03:21pm, Dave Anderson wrote:
> > > > > 
> > > > > 
> > > > > ----- Original Message -----
> > > > > 
> > > > > > On 01/02/18 at 05:08pm, Baoquan He wrote:
> > > > > > > On 01/02/18 at 04:57pm, Dave Young wrote:
> > > > > > > > > The root cause is that this commit makes mem_section a pointer
> > > > > > > > > instead of a static array.
> > > > > > > > 
> > > > > > > > > VMCOREINFO_SYMBOL() expands it to &mem_section, which is no
> > > > > > > > > longer correct in this test case.
> > > > > > > > 
> > > > > > > > This hack code works for me:
> > > > > > > > 
> > > > > > > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > > > > > > index b3663896278e..f5fe6068ae39 100644
> > > > > > > > --- a/kernel/crash_core.c
> > > > > > > > +++ b/kernel/crash_core.c
> > > > > > > > @@ -376,6 +376,8 @@ phys_addr_t __weak
> > > > > > > > paddr_vmcoreinfo_note(void)
> > > > > > > >  {
> > > > > > > >  	return __pa(vmcoreinfo_note);
> > > > > > > >  }
> > > > > > > > +#define VMCOREINFO_SYMBOL_HACK(name) \
> > > > > > > > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned
> > > > > > > > long)name)
> > > > > > > 
> > > > > > > > Oh, you made a new one. We may use vmcoreinfo_append_str directly,
> > > > > > > > since there's an existing one in crash_save_vmcoreinfo():
> > > > > > 
> > > > > > > Yes, it should be something like this instead. That should ensure
> > > > > > > makedumpfile, and maybe the crash tool, work without any
> > > > > > > modifications. Waiting for feedback from Atsushi; also CCing Dave
> > > > > > > about a potential crash utility issue.
> > > > > 
> > > > > Yeah, I ran into that issue when testing 4.15, and fixed it upstream:
> > > > > 
> > > > >   https://github.com/crash-utility/crash/commit/264f22dafe9f37780c4113fd08e8d5b2138edbce
> > > > > 
> > > > >   commit 264f22dafe9f37780c4113fd08e8d5b2138edbce
> > > > >   Author: Dave Anderson <anderson at redhat.com>
> > > > >   Date:   Wed Nov 29 15:28:41 2017 -0500
> > > > > 
> > > > >     Fix for Linux 4.15 and later kernels that are configured with
> > > > >     CONFIG_SPARSEMEM_EXTREME, and that contain kernel commit
> > > > >     83e3c48729d9ebb7af5a31a504f3fd6aff0348c4, titled "mm/sparsemem:
> > > > >     Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y".
> > > > >     Without the patch, kernels configured with SPARSEMEM_EXTREME
> > > > >     have changed the data type of "mem_section" from an array to
> > > > >     a pointer, leading to errors in commands such as "kmem -p",
> > > > >     "kmem -n", "kmem -s", and any other command that translates a
> > > > >     physical address to its page struct address.
> > > > >     (anderson at redhat.com)
> > > > > 
> > > > 
> > > > Hi Dave,
> > > > 
> > > > > I tried the latest crash tool, but a segmentation fault occurred. I put
> > > > > the vmlinux/vmcore at the URL below; could you have a look?
> > > > http://people.redhat.com/ruyang/rhbz1528542/127.0.0.1-2018-01-05-10:29:23/
> > > 
> > > I can't access them:
> > > 
> > >   Forbidden
> > > 
> > >   You don't have permission to access
> > >   /ruyang/rhbz1528542/127.0.0.1-2018-01-05-10:29:23/vmcore on this server.
> > 
> > Oops, updated with chmod 666
> > > 
> > > Although, in the meantime, what should first be clarified is why this
> > > message occurred:
> > > 
> > >   WARNING: kernel version inconsistency between vmlinux and dumpfile
> > > 
> > > If you run with at least "crash -d1", it will display the version strings from
> > > the vmlinux file and the dumpfile, and they should be identical.
> > 
> > > Sorry about that, maybe I mistakenly used another build:
> > dumpfile /proc/version:
> > Linux version 4.15.0-rc6+ (dyoung at dhcp-128-65.nay.redhat.com) (gcc version
> > 7.2.1 20170915 (Red Hat 7.2.1-4) (GCC)) #455 SMP Fri Jan 5 09:57:01 CST 2018
> > vmlinux:
> > Linux version 4.15.0-rc6+ (dyoung at dhcp-128-65.nay.redhat.com) (gcc version
> > 7.2.1 20170915 (Red Hat 7.2.1-4) (GCC)) #456 SMP Fri Jan 5 10:26:27 CST 2018
> > 
> > > But the difference should only be a simple change to the
> > > vmcoreinfo_append_str call, per Baoquan's suggestion. Maybe I used the
> > > Fedora default crash tool; I'm not sure now :(
> 
> No, it's not the crash utility version that's the problem, but rather the
> vmcoreinfo_append_str() addition, which would modify/increment the virtual
> addresses of all static data symbols that come after it.  The crash utility
> doesn't stand a prayer of working when the data symbol values in the vmlinux
> do not match those in the vmcore.

OK, thanks for the reply.  If Atsushi can fix it in makedumpfile without a
kernel patch, that would be ideal.

> 
> Dave
> 
> 
> > 
> > > Retested with a fresh build; crash works fine for me. Here are the new
> > > files:
> > http://people.redhat.com/ruyang/rhbz1528542/127.0.0.1-2018-01-06-14:02:59/
> > 