[PATCH] makedumpfile: call initial before use cache

HAGIO KAZUHITO(萩尾 一仁) k-hagio-ab at nec.com
Wed Jul 17 00:44:00 PDT 2024


Hi Lichen,

Sorry for the long delay.

On 2024/06/25 10:57, Lichen Liu wrote:
> Run 'makedumpfile --mem-usage /proc/kcore' will coredump on ppc64, it is
> because show_mem_usage()->get_page_offset()->get_versiondep_info_ppc64()
> ->readmem() use cache before it is inited by initial().
> 
> Currently only ppc64 has this issue because only
> get_versiondep_info_ppc64() call readmem().
> 
> Signed-off-by: Lichen Liu <lichliu at redhat.com>
> ---
>   makedumpfile.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 5b34712..6a42264 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -12019,6 +12019,9 @@ int show_mem_usage(void)
>   		DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", vmcoreinfo);
>   	}
>   
> +	if (!initial())
> +		return FALSE;
> +
>   	if (!get_page_offset())
>   		return FALSE;
>   
> @@ -12034,9 +12037,6 @@ int show_mem_usage(void)
>   			return FALSE;
>   	}
>   
> -	if (!initial())
> -		return FALSE;
> -
>   	if (!open_dump_bitmap())
>   		return FALSE;
>   

When there is no vmcoreinfo in the /proc/kcore ELF note, initial() needs
to be called after set_kcore_vmcoreinfo().

So with the patch, "makedumpfile --mem-usage" fails on kernels that do
not have a vmcoreinfo in the ELF note, e.g. the RHEL7 kernel:

   # makedumpfile-dev -f --mem-usage /proc/kcore
   exclude_free_page: Can't get necessary symbols for excluding free pages.

   makedumpfile Failed.


Probably readmem() should not be called before initial() in the first 
place.  I think it's the root cause, but I'm not sure how we can fix it.
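
Just to illustrate the ordering problem, here is a minimal standalone
sketch.  It is not makedumpfile code: cache_ready, guard_init() and
guarded_readmem() are hypothetical names, and the buffer is a stand-in
for the real cache.  It only shows how a read that arrives before the
setup could fail cleanly instead of crashing; it does not fix the
ordering itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

/* Hypothetical stand-in for the cache state set up by initial(). */
static int cache_ready = FALSE;
static unsigned char fake_memory[64];

/* Hypothetical stand-in for the cache setup part of initial(). */
static int guard_init(void)
{
	memset(fake_memory, 0xab, sizeof(fake_memory));
	cache_ready = TRUE;
	return TRUE;
}

/* Modelled read: refuse to touch the cache until it has been set up. */
static int guarded_readmem(unsigned long offset, void *buf, size_t size)
{
	if (!cache_ready) {
		fprintf(stderr, "guarded_readmem: cache is not initialized\n");
		return FALSE;
	}
	/* Reject reads that would run past the end of the modelled memory. */
	if (offset > sizeof(fake_memory) || size > sizeof(fake_memory) - offset)
		return FALSE;
	memcpy(buf, fake_memory + offset, size);
	return TRUE;
}
```

Calling guarded_readmem() before guard_init() returns FALSE with a
message, and the same call succeeds afterwards.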

A workaround I thought of is to move get_page_offset() and
get_phys_base() into the !vmcoreinfo block.  These are needed by
set_kcore_vmcoreinfo(), so we can avoid calling them if there is a
vmcoreinfo in the ELF note:

--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -12019,14 +12019,14 @@ int show_mem_usage(void)
                 DEBUG_MSG("Read vmcoreinfo from NOTE segment: %d\n", vmcoreinfo);
         }

-       if (!get_page_offset())
-               return FALSE;
+       if (!vmcoreinfo) {
+               if (!get_page_offset())
+                       return FALSE;

-       /* paddr_to_vaddr() on arm64 needs phys_base. */
-       if (!get_phys_base())
-               return FALSE;
+               /* paddr_to_vaddr() on arm64 needs phys_base. */
+               if (!get_phys_base())
+                       return FALSE;

-       if (!vmcoreinfo) {
                 if (!get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
                         return FALSE;


This will work only for 4.19 and later kernels, but it might reduce the
number of users who hit the issue.  Does this work for you?
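
To make the effect of the workaround easier to see, here is a small
standalone model of the call order.  It is only a sketch under
assumptions: every model_* function and both counters are hypothetical
stubs, not the real makedumpfile functions, and has_note stands for
"the ELF note already carries a vmcoreinfo".

```c
#include <assert.h>

#define TRUE  1
#define FALSE 0

static int cache_ready;   /* set by model_initial() */
static int early_reads;   /* reads attempted before the cache was ready */

/* Stub read: only records whether it ran before the cache setup. */
static void model_readmem(void)
{
	if (!cache_ready)
		early_reads++;
}

/* Models the ppc64 case, where get_page_offset() ends up reading memory. */
static int model_get_page_offset(void)
{
	model_readmem();
	return TRUE;
}

/* Stub for the cache setup done by initial(). */
static int model_initial(void)
{
	cache_ready = TRUE;
	return TRUE;
}

/* Call order of show_mem_usage() with the workaround applied. */
static int model_show_mem_usage(int has_note)
{
	cache_ready = FALSE;
	early_reads = 0;

	if (!has_note) {
		/* Needed here to find vmcoreinfo without the ELF note. */
		if (!model_get_page_offset())
			return FALSE;
	}

	if (!model_initial())
		return FALSE;

	return TRUE;
}
```

With has_note set, the model performs no read before the cache setup.
With has_note clear, the early read is still there, which is why the
workaround does not help kernels without a vmcoreinfo in the ELF note.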

Thanks,
Kazu

