[PATCH] kexec: extend for large cpu count and memory

Wed Jun 16 09:36:09 EDT 2010

Simon,
  In response to your comments on my first version:
  > Could you provide a diff against the current git tree?
  Done.
  > In particular, I think that the temp_region fiddling has already been done.
  Dropped from this patch.

MAX_MEMORY_RANGES is currently 64, which is too small for a very large
NUMA machine (a 512-processor SGI UV, for example). Raise it to 1024.

Also fix a temporary workaround (hack) in load_crashdump_segments() that
assumes 16k is sufficient for the crashdump ELF header. That is too small
for a machine with a large cpu count, because a PT_NOTE is created in the
ELF header for each cpu. The memory reserved for the header is now the
larger of 16k and the actual header size.
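
To put rough numbers on this, here is a small sketch (not part of the patch).
It assumes the standard ELF64 sizes of 64 bytes for Elf64_Ehdr and 56 bytes
for Elf64_Phdr, and counts one program header per cpu (PT_NOTE) plus one per
memory range (PT_LOAD). The helper names are hypothetical, and the real
crash_create_elf64_headers() may add further headers and alignment padding,
so treat the result as a lower bound only.

```c
#include <stddef.h>

/* Assumed sizes from the ELF64 layout: Elf64_Ehdr is 64 bytes,
 * Elf64_Phdr is 56 bytes. MIN_HOLE is the 16K floor used by the hack. */
enum { EHDR_SZ = 64, PHDR_SZ = 56, MIN_HOLE = 16 * 1024 };

/* Hypothetical helper: lower bound on the crashdump ELF header size,
 * one PT_NOTE per cpu and one PT_LOAD per memory range. */
size_t elf64_hdr_estimate(size_t nr_cpus, size_t nr_ranges)
{
	return EHDR_SZ + (nr_cpus + nr_ranges) * PHDR_SZ;
}

/* Hypothetical helper: size of the memory hole to reserve, which is
 * the larger of the 16K floor and the actual header buffer size. */
size_t hole_size(size_t bufsz)
{
	return bufsz < MIN_HOLE ? MIN_HOLE : bufsz;
}
```

With 512 cpus and no memory ranges counted at all, the estimate is
64 + 512 * 56 = 28736 bytes, which already exceeds 16384. Under these
assumptions a 16k header has room for at most 291 program headers.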

Diffed against git.kernel.org/pub/scm/linux/kernel/git/horms/kexec-tools.git

Signed-off-by: Cliff Wickman <cpw at sgi.com>

---
 kexec/arch/i386/kexec-x86.h          |    2 +-
 kexec/arch/x86_64/crashdump-x86_64.c |   15 +++++++++++----
 2 files changed, 12 insertions(+), 5 deletions(-)

Index: kexec-tools/kexec/arch/i386/kexec-x86.h
===================================================================

--- kexec-tools.orig/kexec/arch/i386/kexec-x86.h
+++ kexec-tools/kexec/arch/i386/kexec-x86.h
@@ -1,7 +1,7 @@
 #ifndef KEXEC_X86_H
 #define KEXEC_X86_H
 
-#define MAX_MEMORY_RANGES 64
+#define MAX_MEMORY_RANGES 1024
 
 enum coretype {
 	CORE_TYPE_UNDEF = 0,
Index: kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c
===================================================================
--- kexec-tools.orig/kexec/arch/x86_64/crashdump-x86_64.c
+++ kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c
@@ -591,7 +591,7 @@ int load_crashdump_segments(struct kexec
 				unsigned long max_addr, unsigned long min_base)
 {
 	void *tmp;
-	unsigned long sz, elfcorehdr;
+	unsigned long sz, bufsz, memsz, elfcorehdr;
 	int nr_ranges, align = 1024, i;
 	struct memory_range *mem_range, *memmap_p;
 
@@ -637,9 +637,10 @@ int load_crashdump_segments(struct kexec
 	/* Create elf header segment and store crash image data. */
 	if (crash_create_elf64_headers(info, &elf_info,
 				       crash_memory_range, nr_ranges,
-				       &tmp, &sz,
+				       &tmp, &bufsz,
 				       ELF_CORE_HEADER_ALIGN) < 0)
 		return -1;
+	/* the size of the elf headers allocated is returned in 'bufsz' */
 
 	/* Hack: With some ld versions (GNU ld version 2.14.90.0.4 20030523),
 	 * vmlinux program headers show a gap of two pages between bss segment
@@ -648,9 +649,15 @@ int load_crashdump_segments(struct kexec
 	 * elf core header segment to 16K to avoid being placed in such gaps.
 	 * This is a makeshift solution until it is fixed in kernel.
 	 */
-	elfcorehdr = add_buffer(info, tmp, sz, 16*1024, align, min_base,
+	/* bufsz covers all the PT_NOTE's and PT_LOAD's; memsz is the size
+	 * of the memory hole we look for, never less than the 16K above. */
+	if (bufsz < (16*1024))
+		memsz = 16*1024;
+	else
+		memsz = bufsz;
+	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
 							max_addr, -1);
-	if (delete_memmap(memmap_p, elfcorehdr, sz) < 0)
+	if (delete_memmap(memmap_p, elfcorehdr, memsz) < 0)
 		return -1;
 	cmdline_add_memmap(mod_cmdline, memmap_p);
 	cmdline_add_elfcorehdr(mod_cmdline, elfcorehdr);