[patch 0/9] kdump: Patch series for s390 support

Mon Jul 18 10:19:43 EDT 2011

On Mon, Jul 18, 2011 at 04:00:41PM +0200, Michael Holzheu wrote:
> Hello Vivek,
> 
> On Mon, 2011-07-18 at 08:31 -0400, Vivek Goyal wrote:
> > On Fri, Jul 15, 2011 at 05:43:23PM +0200, Michael Holzheu wrote:
> > > > Or in first step we can keep it even simpler. We can spin in infinite
> > > > loop
> > > 
> > > Looping is probably not a good option in a hypervisor environment like
> > > we have it on s390. At least we should load a disabled wait PSW.
> > 
> > What is "disabled wait PSW"?
> 
> This is a PSW where interrupts are disabled and the wait bit is on. This
> ensures that the virtual CPU is stopped and does not consume any CPU
> time.
>  
> > > > In your case I think you will have to do a little more so that the
> > > > second kernel also sees some of the lower memory areas, so that the
> > > > kernel can be swapped later.
> > > 
> > > After the swap the ELF header is contained in the same memory as the
> > > kdump kernel. When the kdump kernel starts, the ELF header has to be
> > > saved from being overwritten (as kernel and ramdisk). I get the address
> > > from the "elfcorehdr=" kernel parameter. How will I get the size?
> > 
> > By parsing the ELF header. It will give you information about how many
> > program headers and notes are there, their sizes and locations etc.
> 
> The only thing we need is the size of the preallocated header that is in
> kdump memory. All other architectures seem to pass this information
> somehow with different mechanisms to the kdump kernel (memmap kernel
> parameter, boot parameters, etc.). Why should *we* parse the ELF header?

ELF headers and memmap parameters communicate two different pieces of
information to the second kernel.

- memmap tells the second kernel what memory it can use to boot.
- ELF headers tell it what memory areas the first kernel was using, and
  from that information how to construct the ELF headers for the
  /proc/vmcore interface in the second kernel. On x86, ELF headers also
  communicate where the saved CPU state of the first kernel is.

Arch-independent code in the kdump kernel (fs/proc/vmcore.c) parses those
ELF headers to export /proc/vmcore. So if you set up the headers right,
you get that arch-independent code for free, without any changes to generic
code.
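
As a rough illustration of getting the size by parsing, here is a minimal
userspace sketch. It assumes a 64-bit ELF core header in host byte order,
laid out as the ELF header followed by its program header table. The
function name elfcorehdr_size() is hypothetical and is not taken from
kexec-tools or the kernel. Notes and memory that the program headers
point to are assumed to live elsewhere and are not counted.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the number of bytes covered by the ELF
 * header and its program header table, or 0 if buf does not look like a
 * 64-bit ELF core header in host byte order.
 */
static size_t elfcorehdr_size(const void *buf, size_t len)
{
    Elf64_Ehdr ehdr;
    size_t end;

    if (len < sizeof(ehdr))
        return 0;
    memcpy(&ehdr, buf, sizeof(ehdr));

    /* Sanity checks: ELF magic, 64-bit class, core file type. */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr.e_type != ET_CORE)
        return 0;
    if (ehdr.e_phentsize != sizeof(Elf64_Phdr))
        return 0;

    /* The program header table ends at e_phoff + e_phnum * e_phentsize. */
    end = (size_t)ehdr.e_phoff + (size_t)ehdr.e_phnum * ehdr.e_phentsize;
    return end > ehdr.e_ehsize ? end : ehdr.e_ehsize;
}
```

With the address from "elfcorehdr=" and a size computed this way, the
header region could be reserved before anything else overwrites it.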

*Why should you not try to use what is already available?*

> 
> > When kexec-tools loads ELF headers, it knows what's the total size of
> > ELF headers and it removes that chunk of memory from the memory map
> > passed to second kernel with memmap= options. IOW, some memory out
> > of reserved region is not usable by second kernel because we have
> > stored information in that memory. Kdump kernel maps that memory and
> > gets to read the ELF headers.
> > 
> > So you shall have to do something similar where you need to tell second
> > kernel what memory areas it can use for boot and remove ELF header
> > memory area from the map.
> 
> So if we do that, why should we parse the ELF header?

To know three things.

- Memory areas being used by first kernel.
- Cpu states at the time of crash of first kernel.
- Some config options exported by first kernel with the help of ELF notes.
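
To make the first two items concrete, here is a small sketch of walking
the program headers: PT_LOAD entries describe memory areas of the first
kernel, and PT_NOTE entries describe where the notes (CPU state and
similar) are stored. The struct dump_summary and summarize_phdrs() names
are hypothetical, and this is not the code in fs/proc/vmcore.c, only an
illustration of the idea.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the program headers describe. */
struct dump_summary {
    size_t   load_segments;  /* number of PT_LOAD entries */
    uint64_t load_bytes;     /* total memory covered by PT_LOAD entries */
    size_t   note_segments;  /* number of PT_NOTE entries */
    uint64_t note_bytes;     /* total size of the note data */
};

/* Walk n program headers and total up the load and note segments. */
static struct dump_summary summarize_phdrs(const Elf64_Phdr *phdr, size_t n)
{
    struct dump_summary s = {0};
    size_t i;

    for (i = 0; i < n; i++) {
        switch (phdr[i].p_type) {
        case PT_LOAD:
            s.load_segments++;
            s.load_bytes += phdr[i].p_memsz;
            break;
        case PT_NOTE:
            s.note_segments++;
            s.note_bytes += phdr[i].p_memsz;
            break;
        default:
            /* Other segment types are ignored in this sketch. */
            break;
        }
    }
    return s;
}
```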

fs/proc/vmcore.c already does it for you. You just need to make sure that
you tell it the following:

- Where to find the headers in memory (elfcorehdr=)
- A way to map that memory and access contents.
- Make sure these headers are not overwritten by newly booted kernel.

[..]
> > It is possible. Even in x86, we prepare a block of information, one
> > 4K page and fill lots of x86 boot protocol information.
> > 
> > Look at.
> > 
> > kexec-tools/include/x86/x86-linux.h
> > kexec-tools/kexec/arch/i386/x86-linux-setup.c
> > 
> > Above header information contains information about e820 memory map also
> > and we fill that map info for normal kexec (fastboot, not kdump) also and
> > that's how second kernel comes to know about memory map of system.
> > 
> > I think one could possibly truncate the same map for kdump kernel to
> > tell second kernel about the memory to use. But IIRC, original memory
> > map is also used to determine max_pfn present in first kernel so that
> > in second kernel we don't try to map a memory beyond that and access
> > it, etc. Hence it was decided to leave it that way and pass the memory
> > map for second kernel on command line. 
> > 
> > So it's possible that IA64 is preparing a boot-protocol-specific
> > block and passing all the relevant information in that block instead
> > of making use of the command line.
> 
> Just to come back to your initial argument against our meminfo
> approach: It looks like there are already other mechanisms besides
> the ELF header and kernel parameters to pass information to the kdump
> kernel. Where is the conceptual difference from our meminfo interface?

That's a well-defined boot loader and kernel protocol on x86. kexec-tools
is just another boot loader, and it uses that block to fill in the
information a normal boot loader would.

So if you have an s390-specific boot loader/kernel protocol and you extend
that, I think that should still be fine. Just keep the code that fills in
the information in kexec-tools, where s390-specific code can parse it. In
that case we should not require any generic changes to either kexec-tools
or kernel code. All the protocol-specific details should be well hidden
in arch-specific code.

Thanks
Vivek