[PATCH] makedumpfile: call initial before use cache

Lichen Liu lichliu at redhat.com
Sat Jul 20 00:38:35 PDT 2024


Thanks for your explanation!


On Wed, Jul 17, 2024 at 3:44 PM HAGIO KAZUHITO(萩尾 一仁)
<k-hagio-ab at nec.com> wrote:
>
> Hi Lichen,
>
> Sorry for the long delay.
>
> On 2024/06/25 10:57, Lichen Liu wrote:
> > Running 'makedumpfile --mem-usage /proc/kcore' dumps core on ppc64 because
> > show_mem_usage()->get_page_offset()->get_versiondep_info_ppc64()
> > ->readmem() uses the cache before initial() has initialized it.
> >
> > Currently only ppc64 has this issue because only
> > get_versiondep_info_ppc64() calls readmem().
> >
> > Signed-off-by: Lichen Liu <lichliu at redhat.com>
> > ---
> >   makedumpfile.c | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index 5b34712..6a42264 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -12019,6 +12019,9 @@ int show_mem_usage(void)
> >               DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", vmcoreinfo);
> >       }
> >
> > +     if (!initial())
> > +             return FALSE;
> > +
> >       if (!get_page_offset())
> >               return FALSE;
> >
> > @@ -12034,9 +12037,6 @@ int show_mem_usage(void)
> >                       return FALSE;
> >       }
> >
> > -     if (!initial())
> > -             return FALSE;
> > -
> >       if (!open_dump_bitmap())
> >               return FALSE;
> >
>
> initial() needs to be called after set_kcore_vmcoreinfo() when there is
> no vmcoreinfo in the /proc/kcore ELF note.
>
> So with the patch, "makedumpfile --mem-usage" fails on kernels that do
> not have a vmcoreinfo in the ELF note, e.g. the RHEL7 kernel:
>
>    # makedumpfile-dev -f --mem-usage /proc/kcore
>    exclude_free_page: Can't get necessary symbols for excluding free pages.
>
>    makedumpfile Failed.
>
>
> Probably readmem() should not be called before initial() in the first
> place.  I think it's the root cause, but I'm not sure how we can fix it.
>
> A workaround I thought of is to move get_page_offset() and
> get_phys_base() into the !vmcoreinfo block.  These are needed by
> set_kcore_vmcoreinfo(), so we can avoid calling them if there is a
> vmcoreinfo in the ELF note:
>
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -12019,14 +12019,14 @@ int show_mem_usage(void)
>                  DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", vmcoreinfo);
>          }
>
> -       if (!get_page_offset())
> -               return FALSE;
> +       if (!vmcoreinfo) {
> +               if (!get_page_offset())
> +                       return FALSE;
>
> -       /* paddr_to_vaddr() on arm64 needs phys_base. */
> -       if (!get_phys_base())
> -               return FALSE;
> +               /* paddr_to_vaddr() on arm64 needs phys_base. */
> +               if (!get_phys_base())
> +                       return FALSE;
>
> -       if (!vmcoreinfo) {
>                  if (!get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
>                          return FALSE;
>
>
> This will work only for 4.19 and later kernels, but it might reduce the
> number of users who hit the issue.  Does this work for you?

That works for me because I'm testing on a 6.x kernel.
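
For reference, here is a minimal sketch of the call order the workaround
produces, as I understand it. Everything in it is a hypothetical stand-in
and not the real makedumpfile code: the stub bodies, cache_ready,
readmem_before_init, reset_sketch_state() and show_mem_usage_sketch() are
made up for illustration, and the stub readmem() returns an error where
the real one would crash.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this exists in makedumpfile. */
bool cache_ready = false;       /* set once initial() has run */
int readmem_before_init = 0;    /* counts reads made before initial() */

void reset_sketch_state(void)
{
	cache_ready = false;
	readmem_before_init = 0;
}

/* Stub: stands in for initial(), which sets up the cache. */
bool initial(void)
{
	cache_ready = true;
	return true;
}

/* Stub: the real readmem() would crash here instead of failing. */
bool readmem(void)
{
	if (!cache_ready) {
		readmem_before_init++;
		return false;
	}
	return true;
}

/* Stub: models the ppc64 path, where get_page_offset() reaches readmem(). */
bool get_page_offset(void) { return readmem(); }
bool get_phys_base(void) { return true; }
bool set_kcore_vmcoreinfo(void) { return true; }

/* Ordering after the workaround; "vmcoreinfo" means the ELF note has one. */
bool show_mem_usage_sketch(bool vmcoreinfo)
{
	if (!vmcoreinfo) {
		/* Only needed to build vmcoreinfo by hand. */
		if (!get_page_offset())
			return false;
		if (!get_phys_base())
			return false;
		if (!set_kcore_vmcoreinfo())
			return false;
	}

	if (!initial())
		return false;

	/* From here on the cache must be usable. */
	assert(cache_ready);
	return true;
}
```

In this model, show_mem_usage_sketch(true) never reaches readmem() before
initial(), while show_mem_usage_sketch(false) still does. That matches the
point above that the workaround only helps when the ELF note carries a
vmcoreinfo.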

Thanks,
Lichen
>
> Thanks,
> Kazu




More information about the kexec mailing list