[PATCH makedumpfile] Fix incorrect PFN exclusion when LOAD segments overlap
HAGIO KAZUHITO(萩尾 一仁)
k-hagio-ab at nec.com
Mon Sep 30 01:18:01 PDT 2024
Hi Ming,
sorry for the late response to this patch.
On 2024/08/14 11:43, Ming Wang wrote:
> When iterating through LOAD segments to exclude pages in
> `exclude_nodata_pages`, simply using `phys_end` to track allocated
> PFNs can lead to the erroneous exclusion of valid pages from a
> subsequent LOAD segment if its `phys_start` overlaps with the
> previous segment's address range.
>
> This patch addresses the issue by checking for such overlaps with the
> next LOAD segment's physical address range. If an overlap is
> detected, we continue including PFNs from the current segment until
> reaching the end of the overlapping region.
>
> This fix ensures that all valid pages within overlapping LOAD segments
> are correctly included during PFN exclusion.
I have a few questions:
- I heard from Masa that the overlap is caused by a bug, isn't it fixed
on the kernel side? or it will be fixed but is this patch for existing
kernels that have the bug?
- exclude_nodata_pages() is for dumpfiles created by "makedumpfile -E",
which uses p_filesz in program headers to record existing pages after
exclusion. The function excludes pages only between
phys_start+file_size and phys_end, so if p_filesz == p_memsz, it will
not exclude any pages.
I would like to check how they are on your machine, so could I have a
"readelf -l /proc/vmcore" output?
- Do you mean that the bug always puts the two LOAD segments that have
the same phys_start in a row?
Thanks,
Kazu
>
> Signed-off-by: Ming Wang <wangming01 at loongson.cn>
> ---
> makedumpfile.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index ce0d40e..f07f831 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -5207,7 +5207,7 @@ static void
> exclude_nodata_pages(struct cycle *cycle)
> {
> int i;
> - unsigned long long phys_start, phys_end;
> + unsigned long long phys_start, phys_end, next_phys_start, next_phys_end, overlap;
> off_t file_size;
>
> i = 0;
> @@ -5222,7 +5222,25 @@ exclude_nodata_pages(struct cycle *cycle)
> pfn = cycle->start_pfn;
> if (pfn_end >= cycle->end_pfn)
> pfn_end = cycle->end_pfn;
> - while (pfn < pfn_end) {
> +
> + /**
> + * phys_start phys_end
> + * LOAD[ 0] 200000 2186200
> + * LOAD[ 1] 200000 ee00000
> + * LOAD[ 2] 90400000 b8a00000
> + *
> + * In the above example, When iterating through LOAD segments, simply excluding
> + * already allocated Physical Frames (PFNs) based on 'phys_end' can lead to an
> + * incorrect exclusion of valid pages within load[1], particularly in overlapping
> + * regions between load[0] and load[1].
> + */
> + get_pt_load_extents(i + 1, &next_phys_start, &next_phys_end, NULL, &file_size);
> + if (next_phys_start == phys_start && next_phys_end > phys_end)
> + overlap = TRUE;
> + else
> + overlap = FALSE;
> +
> + while (pfn < pfn_end && !overlap) {
> clear_bit_on_2nd_bitmap(pfn, cycle);
> ++pfn;
> }
More information about the kexec
mailing list