[PATCH v3 1/2] Generic handling of multi-page exclusions

Petr Tesarik ptesarik at suse.cz
Wed May 14 01:41:11 PDT 2014


On Wed, 14 May 2014 07:54:17 +0000
Atsushi Kumagai <kumagai-atsushi at mxc.nes.nec.co.jp> wrote:

> Hello Petr,
> 
> >When multiple pages are excluded from the dump, store the extents in
> >struct cycle and check if anything is still pending on the next invocation
> >of __exclude_unnecessary_pages. This assumes that:
> >
> >  1. after __exclude_unnecessary_pages is called for a struct mem_map_data
> >     that extends beyond the current cycle, it is not called again during
> >     that cycle,
> >  2. in the next cycle, __exclude_unnecessary_pages is not called before
> >     this final struct mem_map_data.
> >
> >Both assumptions are met if struct mem_map_data segments:
> >
> >  1. do not overlap,
> >  2. are sorted by physical address in ascending order.
> 
> In the ELF case, write_elf_pages_cyclic() processes PT_LOAD entries starting
> from PT_LOAD(0), which unfortunately can break both assumptions.
> Actually this patch doesn't work on my machine:
> 
> LOAD (0)
>   phys_start : 1000000
>   phys_end   : 182f000
>   virt_start : ffffffff81000000
>   virt_end   : ffffffff8182f000
> LOAD (1)
>   phys_start : 1000
>   phys_end   : 9b400
>   virt_start : ffff810000001000
>   virt_end   : ffff81000009b400
> LOAD (2)
>   phys_start : 100000
>   phys_end   : 27000000
>   virt_start : ffff810000100000
>   virt_end   : ffff810027000000
> LOAD (3)
>   phys_start : 37000000
>   phys_end   : cff70000
>   virt_start : ffff810037000000
>   virt_end   : ffff8100cff70000
> LOAD (4)
>   phys_start : 100000000
>   phys_end   : 170000000
>   virt_start : ffff810100000000
>   virt_end   : ffff810170000000
> 
> 
> PT_LOAD(2) includes PT_LOAD(0), and their physical addresses aren't sorted.
> 
> If the only problem were the sort issue, it would be easy to fix with a new
> iterator like "for_each_pt_load()" that walks the PT_LOAD entries in
> ascending order of physical address.
> However, I don't have a good idea for solving the overlap issue yet...

I have. I can go back to my previous version and add those fields to
struct mem_map_data. I only changed it because of this feedback from
HATAYAMA Daisuke:

http://lists.infradead.org/pipermail/kexec/2014-April/011477.html

If I add the fields to struct mem_map_data, the code will no longer depend on
any specific call order.
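
Roughly, the direction looks like this (just a sketch of the idea, not the
actual v4 patch; the exclude_* names simply mirror the per-cycle fields in
the hunk below and are illustrative):

struct mem_map_data {
	mdf_pfn_t	pfn_start;
	mdf_pfn_t	pfn_end;
	unsigned long	mem_map;

	/* pending multi-page exclusion, tracked per mapping instead of
	 * per cycle, so the result does not depend on the order in which
	 * the mappings are visited */
	mdf_pfn_t	exclude_pfn_start;
	mdf_pfn_t	exclude_pfn_end;
	mdf_pfn_t	*exclude_pfn_counter;
};

Each mapping then carries its own leftover, and __exclude_unnecessary_pages()
picks it up the next time that mapping is processed, whatever was processed
in between.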

OK, time for version 4.

Petr T

> Thanks
> Atsushi Kumagai
> 
> >These two conditions are true for all supported memory models.
> >
> >Note that the start PFN of the excluded extent is set to the end of the
> >current cycle (which is equal to the start of the next cycle, see
> >update_cycle), so only the part of the excluded region which falls beyond
> >the current cycle buffer is valid. If the excluded region is completely
> >processed in the current cycle, the start PFN is bigger than the end PFN
> >and no work is done at the beginning of the next cycle.
> >
> >After processing the leftover from the last cycle, pfn_start and mem_map are
> >adjusted to skip the excluded pages. There is no check whether the
> >adjusted pfn_start is within the current cycle. Nothing bad happens if
> >it isn't, because pages outside the current cyclic region are ignored by
> >the subsequent loop, and the remainder is postponed to the next cycle by
> >exclude_range().
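
To see the carry-over with concrete numbers, here is a standalone mock (not
makedumpfile code; the types are simplified and the PFN values are made up):

#include <stdio.h>

typedef unsigned long long mdf_pfn_t;

struct cycle {
	mdf_pfn_t start_pfn;
	mdf_pfn_t end_pfn;
	mdf_pfn_t exclude_pfn_start;
	mdf_pfn_t exclude_pfn_end;
};

/* simplified exclude_range(): clear up to the cycle end now and remember
 * whatever spills over for the next cycle */
static void exclude_range(mdf_pfn_t pfn, mdf_pfn_t endpfn, struct cycle *cycle)
{
	cycle->exclude_pfn_start = cycle->end_pfn;
	cycle->exclude_pfn_end = endpfn;

	if (endpfn > cycle->end_pfn)
		endpfn = cycle->end_pfn;
	printf("cycle [%#llx,%#llx): clearing [%#llx,%#llx)\n",
	       cycle->start_pfn, cycle->end_pfn, pfn, endpfn);
}

int main(void)
{
	/* cycle covers PFNs [0x10000,0x20000); a free block of 0x400 pages
	 * starts at 0x1fe00, so 0x200 pages spill past the cycle end */
	struct cycle c = { .start_pfn = 0x10000, .end_pfn = 0x20000 };

	exclude_range(0x1fe00, 0x1fe00 + 0x400, &c);

	/* next cycle: the pending extent [0x20000,0x20200) is non-empty,
	 * so it is cleared before any struct page is read */
	c.start_pfn = 0x20000;
	c.end_pfn = 0x30000;
	if (c.exclude_pfn_start < c.exclude_pfn_end)
		printf("cycle [%#llx,%#llx): leftover [%#llx,%#llx)\n",
		       c.start_pfn, c.end_pfn,
		       c.exclude_pfn_start, c.exclude_pfn_end);

	return 0;
}

If the block had ended inside the first cycle, the stored start (0x20000)
would exceed the stored end, and the next cycle would do nothing extra,
which is the case described above.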
> >
> >Signed-off-by: Petr Tesarik <ptesarik at suse.cz>
> >---
> > makedumpfile.c | 49 +++++++++++++++++++++++++++++++++++--------------
> > makedumpfile.h |  5 +++++
> > 2 files changed, 40 insertions(+), 14 deletions(-)
> >
> >diff --git a/makedumpfile.c b/makedumpfile.c
> >index 16081a5..a3498e4 100644
> >--- a/makedumpfile.c
> >+++ b/makedumpfile.c
> >@@ -4667,6 +4667,26 @@ initialize_2nd_bitmap_cyclic(struct cycle *cycle)
> > 	return TRUE;
> > }
> >
> >+static void
> >+exclude_range(mdf_pfn_t *counter, mdf_pfn_t pfn, mdf_pfn_t endpfn,
> >+    struct cycle *cycle)
> >+{
> >+	if (cycle) {
> >+		cycle->exclude_pfn_start = cycle->end_pfn;
> >+		cycle->exclude_pfn_end = endpfn;
> >+		cycle->exclude_pfn_counter = counter;
> >+
> >+		if (cycle->end_pfn < endpfn)
> >+			endpfn = cycle->end_pfn;
> >+	}
> >+
> >+	while (pfn < endpfn) {
> >+		if (clear_bit_on_2nd_bitmap_for_kernel(pfn, cycle))
> >+			(*counter)++;
> >+		++pfn;
> >+	}
> >+}
> >+
> > int
> > __exclude_unnecessary_pages(unsigned long mem_map,
> >     mdf_pfn_t pfn_start, mdf_pfn_t pfn_end, struct cycle *cycle)
> >@@ -4681,6 +4701,18 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> > 	unsigned long flags, mapping, private = 0;
> >
> > 	/*
> >+	 * If a multi-page exclusion is pending, do it first
> >+	 */
> >+	if (cycle && cycle->exclude_pfn_start < cycle->exclude_pfn_end) {
> >+		exclude_range(cycle->exclude_pfn_counter,
> >+			cycle->exclude_pfn_start, cycle->exclude_pfn_end,
> >+			cycle);
> >+
> >+		mem_map += (cycle->exclude_pfn_end - pfn_start) * SIZE(page);
> >+		pfn_start = cycle->exclude_pfn_end;
> >+	}
> >+
> >+	/*
> > 	 * Refresh the buffer of struct page, when changing mem_map.
> > 	 */
> > 	pfn_read_start = ULONGLONG_MAX;
> >@@ -4744,21 +4776,10 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> > 		if ((info->dump_level & DL_EXCLUDE_FREE)
> > 		    && info->page_is_buddy
> > 		    && info->page_is_buddy(flags, _mapcount, private, _count)) {
> >-			int i, nr_pages = 1 << private;
> >+			int nr_pages = 1 << private;
> >+
> >+			exclude_range(&pfn_free, pfn, pfn + nr_pages, cycle);
> >
> >-			for (i = 0; i < nr_pages; ++i) {
> >-				/*
> >-				 * According to combination of
> >-				 * MAX_ORDER and size of cyclic
> >-				 * buffer, this clearing bit operation
> >-				 * can overrun the cyclic buffer.
> >-				 *
> >-				 * See check_cyclic_buffer_overrun()
> >-				 * for the detail.
> >-				 */
> >-				if (clear_bit_on_2nd_bitmap_for_kernel((pfn + i), cycle))
> >-					pfn_free++;
> >-			}
> > 			pfn += nr_pages - 1;
> > 			mem_map += (nr_pages - 1) * SIZE(page);
> > 		}
> >diff --git a/makedumpfile.h b/makedumpfile.h
> >index eb03688..43cf91d 100644
> >--- a/makedumpfile.h
> >+++ b/makedumpfile.h
> >@@ -1593,6 +1593,11 @@ int get_xen_info_ia64(void);
> > struct cycle {
> > 	mdf_pfn_t start_pfn;
> > 	mdf_pfn_t end_pfn;
> >+
> >+	/* for excluding multi-page regions */
> >+	mdf_pfn_t exclude_pfn_start;
> >+	mdf_pfn_t exclude_pfn_end;
> >+	mdf_pfn_t *exclude_pfn_counter;
> > };
> >
> > static inline int
> >--
> >1.8.4.5



