makedumpfile saving vmcore fails with dynamically allocated mem_section (was: Re: [PATCH] handle renamed init_level4_pgt -> init_top_pgt)
Dave Young
dyoung at redhat.com
Thu Jan 4 19:19:55 PST 2018
On 01/05/18 at 10:38am, Dave Young wrote:
> On 01/03/18 at 03:21pm, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> >
> > > On 01/02/18 at 05:08pm, Baoquan He wrote:
> > > > On 01/02/18 at 04:57pm, Dave Young wrote:
> > > > > > The root cause is that this commit makes mem_section a pointer instead of
> > > > > > a static array.
> > > > > >
> > > > > > VMCOREINFO_SYMBOL() expands it to &mem_section, which is no longer correct
> > > > > > in this test case.
> > > > >
> > > > > This hack code works for me:
> > > > >
> > > > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > > > index b3663896278e..f5fe6068ae39 100644
> > > > > --- a/kernel/crash_core.c
> > > > > +++ b/kernel/crash_core.c
> > > > > @@ -376,6 +376,8 @@ phys_addr_t __weak paddr_vmcoreinfo_note(void)
> > > > > {
> > > > > return __pa(vmcoreinfo_note);
> > > > > }
> > > > > +#define VMCOREINFO_SYMBOL_HACK(name) \
> > > > > + vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
> > > >
> > > > Oh, you made a new one. We may use vmcoreinfo_append_str directly since
> > > > there's an existing one in crash_save_vmcoreinfo():
> > >
> > > > Yes, it should be something like this instead. This should ensure that
> > > > makedumpfile, and maybe the crash tool, work without any modifications.
> > > > Waiting for feedback from Atsushi; also ccing Dave for a potential
> > > > crash utility issue.
> >
> > Yeah, I ran into that issue when testing 4.15, and fixed it upstream:
> >
> > https://github.com/crash-utility/crash/commit/264f22dafe9f37780c4113fd08e8d5b2138edbce
> >
> > commit 264f22dafe9f37780c4113fd08e8d5b2138edbce
> > Author: Dave Anderson <anderson at redhat.com>
> > Date: Wed Nov 29 15:28:41 2017 -0500
> >
> > Fix for Linux 4.15 and later kernels that are configured with
> > CONFIG_SPARSEMEM_EXTREME, and that contain kernel commit
> > 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4, titled "mm/sparsemem:
> > Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y".
> > Without the patch, kernels configured with SPARSEMEM_EXTREME
> > have changed the data type of "mem_section" from an array to
> > a pointer, leading to errors in commands such as "kmem -p",
> > "kmem -n", "kmem -s", and any other command that translates a
> > physical address to its page struct address.
> > (anderson at redhat.com)
> >
>
> Hi Dave,
>
> I tried the latest crash tool, but it hit a segmentation fault. I put the
> vmlinux/vmcore at the URL below, could you have a look?
> http://people.redhat.com/ruyang/rhbz1528542/127.0.0.1-2018-01-05-10:29:23/
>
> ---
> crash 7.2.0++
> Copyright (C) 2002-2017 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> WARNING: could not find MAGIC_START!
> WARNING: kernel version inconsistency between vmlinux and dumpfile
>
> please wait... (gathering task table data)
> WARNING: duplicate idle tasks?
>
> WARNING: duplicate idle tasks?
> please wait... (determining panic task)
> WARNING: active task ffffffff81e90536 on cpu 0 not found in PID hash
>
> [ 1156.567444] crash[2964]: segfault at 7175228 ip 00000000004bf39e sp 00007ffd5dce3f30 error 6 in crash[400000+6da000]
> Segmentation fault (core dumped)
>
> The tested kernel has the patch below applied (tuned according to bhe's
> suggestion):
>
> ---
> kernel/crash_core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- linux.orig/kernel/crash_core.c
> +++ linux/kernel/crash_core.c
> @@ -410,7 +410,8 @@ static int __init crash_save_vmcoreinfo_
> VMCOREINFO_SYMBOL(contig_page_data);
> #endif
> #ifdef CONFIG_SPARSEMEM
> - VMCOREINFO_SYMBOL(mem_section);
> + vmcoreinfo_append_str("SYMBOL(mem_section)=%lx\n",
> + (unsigned long)mem_section);
> VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> VMCOREINFO_STRUCT_SIZE(mem_section);
> VMCOREINFO_OFFSET(mem_section, section_mem_map);
>
>
BTW, crash works fine when CONFIG_SPARSEMEM_EXTREME is disabled.
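
For anyone following along, here is a minimal userspace sketch (not kernel code) of why exporting &mem_section stops working once mem_section becomes a pointer. The struct layout, the array sizes, and the names static_roots, dynamic_roots, exported_by_symbol_macro(), exported_by_patch() and parse_mem_section() are made up for illustration. Only the "SYMBOL(%s)=%lx\n" format string comes from the patches quoted above.

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel's struct mem_section. */
struct mem_section { unsigned long section_mem_map; };

/* Old layout: mem_section is a static array, so the address of the
 * symbol is also the address of the root table. */
struct mem_section *static_roots[4];

/* New layout: mem_section is a pointer variable that is set at runtime.
 * dynamic_storage stands in for the runtime allocation. */
struct mem_section *dynamic_storage[4];
struct mem_section **dynamic_roots = dynamic_storage;

/* Roughly what taking &mem_section yields once mem_section is a pointer:
 * the address of the pointer variable, not of the table. */
unsigned long exported_by_symbol_macro(void)
{
    return (unsigned long)&dynamic_roots;
}

/* What the patched line exports: the pointer's value, i.e. the table. */
unsigned long exported_by_patch(void)
{
    return (unsigned long)dynamic_roots;
}

/* Format a vmcoreinfo-style SYMBOL line into buf. */
int format_symbol(char *buf, size_t len, const char *name, unsigned long addr)
{
    return snprintf(buf, len, "SYMBOL(%s)=%lx\n", name, addr);
}

/* Hypothetical consumer side: read the address back from the line.
 * Returns 1 on success, 0 otherwise. */
int parse_mem_section(const char *line, unsigned long *addr)
{
    return sscanf(line, "SYMBOL(mem_section)=%lx", addr) == 1;
}
```

With the array, (unsigned long)&static_roots and (unsigned long)static_roots are the same number, so a tool that treats the exported value as the start of the root table is fine. With the pointer, exported_by_symbol_macro() and exported_by_patch() differ, and only the latter points at the table. That is the assumption the patched vmcoreinfo_append_str() line relies on.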