makedumpfile saving vmcore fails with dynamically allocated mem_section (was: Re: [PATCH] handle renamed init_level4_pgt -> init_top_pgt)

Dave Young dyoung at redhat.com
Fri Jan 5 22:13:46 PST 2018


On 01/05/18 at 09:16am, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
> > On 01/03/18 at 03:21pm, Dave Anderson wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > 
> > > > On 01/02/18 at 05:08pm, Baoquan He wrote:
> > > > > On 01/02/18 at 04:57pm, Dave Young wrote:
> > > > > > The root cause is that this commit makes mem_section a pointer
> > > > > > instead of a static array.
> > > > > > 
> > > > > > VMCOREINFO_SYMBOL() expands it to &mem_section, which is no longer
> > > > > > correct in this case.
> > > > > > 
> > > > > > This hack code works for me:
> > > > > > 
> > > > > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > > > > index b3663896278e..f5fe6068ae39 100644
> > > > > > --- a/kernel/crash_core.c
> > > > > > +++ b/kernel/crash_core.c
> > > > > > @@ -376,6 +376,8 @@ phys_addr_t __weak paddr_vmcoreinfo_note(void)
> > > > > >  {
> > > > > >  	return __pa(vmcoreinfo_note);
> > > > > >  }
> > > > > > +#define VMCOREINFO_SYMBOL_HACK(name) \
> > > > > > +	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> > > > > 
> > > > > Oh, you made a new one. We may use vmcoreinfo_append_str directly since
> > > > > there's an existing one in crash_save_vmcoreinfo():
> > > > 
> > > > Yes, it should be something like this instead. This should ensure that
> > > > makedumpfile, and maybe the crash tool, work without any modifications.
> > > > Waiting for feedback from Atsushi; also CCing Dave about a potential
> > > > issue in the crash utility.
> > > 
> > > Yeah, I ran into that issue when testing 4.15, and fixed it upstream:
> > > 
> > >   https://github.com/crash-utility/crash/commit/264f22dafe9f37780c4113fd08e8d5b2138edbce
> > > 
> > >   commit 264f22dafe9f37780c4113fd08e8d5b2138edbce
> > >   Author: Dave Anderson <anderson at redhat.com>
> > >   Date:   Wed Nov 29 15:28:41 2017 -0500
> > > 
> > >     Fix for Linux 4.15 and later kernels that are configured with
> > >     CONFIG_SPARSEMEM_EXTREME, and that contain kernel commit
> > >     83e3c48729d9ebb7af5a31a504f3fd6aff0348c4, titled "mm/sparsemem:
> > >     Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y".
> > >     Without the patch, kernels configured with SPARSEMEM_EXTREME
> > >     have changed the data type of "mem_section" from an array to
> > >     a pointer, leading to errors in commands such as "kmem -p",
> > >     "kmem -n", "kmem -s", and any other command that translates a
> > >     physical address to its page struct address.
> > >     (anderson at redhat.com)
> > > 
> > 
> > Hi Dave,
> > 
> > I tried the latest crash tool, but a segmentation fault happened. I put the
> > vmlinux/vmcore at the URL below; could you have a look?
> > http://people.redhat.com/ruyang/rhbz1528542/127.0.0.1-2018-01-05-10:29:23/
> 
> I can't access them:
> 
>   Forbidden
> 
>   You don't have permission to access /ruyang/rhbz1528542/127.0.0.1-2018-01-05-10:29:23/vmcore on this server.

Oops, updated with chmod 666 
> 
> Although, in the meantime, what should first be clarified is why this message occurred:
> 
>   WARNING: kernel version inconsistency between vmlinux and dumpfile
> 
> If you run with at least "crash -d1", it will display the version strings from
> the vmlinux file and the dumpfile, and they should be identical.

Sorry about that, I may have mistakenly used another build:
dumpfile /proc/version:
Linux version 4.15.0-rc6+ (dyoung at dhcp-128-65.nay.redhat.com) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-4) (GCC)) #455 SMP Fri Jan 5 09:57:01 CST 2018
vmlinux:
Linux version 4.15.0-rc6+ (dyoung at dhcp-128-65.nay.redhat.com) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-4) (GCC)) #456 SMP Fri Jan 5 10:26:27 CST 2018
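
A minimal sketch of locating where two such version strings diverge, in
userspace C. The function name first_mismatch is hypothetical, and this is
not how crash itself performs the check; it only illustrates that a single
differing character is enough to make the strings non-identical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first character at which the two strings
 * differ, or -1 if they are identical. If one string is a prefix of
 * the other, the index is the length of the shorter string.
 */
long first_mismatch(const char *a, const char *b)
{
    size_t i = 0;

    /* Walk forward while both strings agree and a has not ended. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;

    /* Equal here means both strings ended at the same position. */
    return (a[i] == b[i]) ? -1 : (long)i;
}
```

Applied to the two strings above, the first mismatch would fall in the build
number (#455 in the dumpfile, #456 in vmlinux); everything before it matches.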

But the difference should only be a simple change to the
vmcoreinfo_append_str call, per Baoquan's suggestion. Maybe I used the
default Fedora crash tool; I'm not sure now :(

Retested with a fresh build, and crash works fine for me. Here are the new
files:
http://people.redhat.com/ruyang/rhbz1528542/127.0.0.1-2018-01-06-14:02:59/

> 
> Dave
> 
> 
> > 
> > ---
> > crash 7.2.0++
> > Copyright (C) 2002-2017  Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> > Copyright (C) 1999-2006  Hewlett-Packard Co
> > Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> > Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> > Copyright (C) 2005, 2011  NEC Corporation
> > Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions.  Enter "help copying" to see the conditions.
> > This program has absolutely no warranty.  Enter "help warranty" for details.
> >  
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-unknown-linux-gnu"...
> > 
> > WARNING: could not find MAGIC_START!
> > WARNING: kernel version inconsistency between vmlinux and dumpfile
> > 
> > please wait... (gathering task table data)
> > WARNING: duplicate idle tasks?
> > 
> > WARNING: duplicate idle tasks?
> > please wait... (determining panic task)
> > WARNING: active task ffffffff81e90536 on cpu 0 not found in PID hash
> > 
> > [ 1156.567444] crash[2964]: segfault at 7175228 ip 00000000004bf39e sp 00007ffd5dce3f30 error 6 in crash[400000+6da000]
> > Segmentation fault (core dumped)
> > 
> > The tested kernel has the patch below applied (tuned according to bhe's
> > suggestion):
> > 
> > ---
> >  kernel/crash_core.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > --- linux.orig/kernel/crash_core.c
> > +++ linux/kernel/crash_core.c
> > @@ -410,7 +410,8 @@ static int __init crash_save_vmcoreinfo_
> >  	VMCOREINFO_SYMBOL(contig_page_data);
> >  #endif
> >  #ifdef CONFIG_SPARSEMEM
> > -	VMCOREINFO_SYMBOL(mem_section);
> > +	vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> > +			      (unsigned long)mem_section);
> >  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> >  	VMCOREINFO_STRUCT_SIZE(mem_section);
> >  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > 
> > 
> > Thanks
> > Dave
> > 
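
To illustrate why the patch above exports the value of mem_section rather
than its address, here is a minimal userspace sketch. It is not kernel code:
array_sec, ptr_sec, ADDR_OF, VALUE_OF and format_symbol are hypothetical
names made up for the illustration, and struct mem_section is reduced to a
single field. The assumption, taken from the thread, is that
VMCOREINFO_SYMBOL() emits the address-of form.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel struct; only here so the sketch compiles. */
struct mem_section { unsigned long section_mem_map; };

/* Old layout: the symbol is the table itself (a static array). */
struct mem_section array_sec[4];
/* New layout: the symbol is a pointer to a table that lives elsewhere. */
struct mem_section *ptr_sec = array_sec;

/* Address-of form, which the thread says VMCOREINFO_SYMBOL() expands to. */
#define ADDR_OF(name)  ((unsigned long)&(name))
/* Value form, as in the patch: the address the pointer holds. */
#define VALUE_OF(name) ((unsigned long)(name))

/* Hypothetical helper: format one SYMBOL line in the same shape as the patch. */
int format_symbol(char *buf, size_t len, const char *name, unsigned long value)
{
    return snprintf(buf, len, "SYMBOL(%s)=%lx\n", name, value);
}

/* Print both forms for both layouts so the difference is visible. */
void show_difference(void)
{
    char buf[64];

    format_symbol(buf, sizeof(buf), "array_sec", ADDR_OF(array_sec));
    fputs(buf, stdout);   /* address of the table */
    format_symbol(buf, sizeof(buf), "ptr_sec", ADDR_OF(ptr_sec));
    fputs(buf, stdout);   /* address of the pointer variable, not the table */
    format_symbol(buf, sizeof(buf), "ptr_sec", VALUE_OF(ptr_sec));
    fputs(buf, stdout);   /* address of the table again */

    /* Only the value form of the pointer leads back to the table. */
    assert(VALUE_OF(ptr_sec) == ADDR_OF(array_sec));
}
```

For an array, &name and name give the same address, so the address-of form
was fine before the change. For a pointer, &name is the location of the
pointer variable, so a tool reading SYMBOL(mem_section) as the table address
would look in the wrong place unless the exported value is the pointer's
contents.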


