[PATCH 1/4 -mm] kexec based hibernation -v7 : kexec jump
Huang, Ying
ying.huang at intel.com
Tue Dec 11 03:51:36 EST 2007
On Mon, 2007-12-10 at 14:55 -0500, Vivek Goyal wrote:
> On Fri, Dec 07, 2007 at 03:53:30PM +0000, Huang, Ying wrote:
> > This patch implements the functionality of jumping between the kexeced
> > kernel and the original kernel.
> >
>
> Hi,
>
> I am just going through your patches and trying to understand it. Don't
> understand many things. Asking is easy so here you go...
>
> > To support jumping between two kernels, before jumping to (executing)
> > the new kernel and jumping back to the original kernel, the devices
> > are put into quiescent state, and the state of devices and CPU is
> > saved. After jumping back from kexeced kernel and jumping to the new
> > kernel, the state of devices and CPU are restored accordingly. The
> > devices/CPU state save/restore code of software suspend is called to
> > implement corresponding function.
> >
>
> I need jumping back to restore a already hibernated kernel image? Can
> you please tell little more about jumping back and why it is needed?
Currently, jumping back is used to implement "kexec based hibernation",
which uses kexec/kdump to save the memory image of the hibernated kernel
during hibernation, then uses /dev/oldmem to restore that memory image
and jumps back to the hibernated kernel so that it continues to
run.
Other possible usage models include:
- Dumping the system memory image and then continuing to run, that is,
taking a memory snapshot of the system while it is running.
- Cooperative multitasking between different OSes. You can load another OS (B)
from the current OS (A), and jump between the two OSes as needed.
- Calling some code (such as firmware) in physical mode.
> > To support jumping without reserving memory. One shadow backup page
> > (source page) is allocated for each page used by new (kexeced) kernel
> > (destination page). When do kexec_load, the image of new kernel is
> > loaded into source pages, and before executing, the destination pages
> > and the source pages are swapped, so the contents of destination pages
> > are backupped. Before jumping to the new (kexeced) kernel and after
> > jumping back to the original kernel, the destination pages and the
> > source pages are swapped too.
> >
>
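To illustrate the swapping described above, here is a minimal user-space
sketch in C. It is not the patch code; it assumes 4 KiB pages and a simple
bounce buffer, and the names sketch_swap_page and sketch_swap_all are
hypothetical. Swapping once backs up the destination pages into the shadow
(source) pages; swapping again restores them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages, as on i386 */

/* Swap the contents of two distinct pages through a bounce buffer. */
static void sketch_swap_page(unsigned char *dst, unsigned char *src)
{
    static unsigned char bounce[SKETCH_PAGE_SIZE];

    assert(dst != NULL && src != NULL && dst != src);
    memcpy(bounce, dst, SKETCH_PAGE_SIZE);  /* save old destination contents */
    memcpy(dst, src, SKETCH_PAGE_SIZE);     /* move new image into place */
    memcpy(src, bounce, SKETCH_PAGE_SIZE);  /* shadow page now holds the backup */
}

/* Swap every destination page with its shadow (source) page. */
static void sketch_swap_all(unsigned char *dst[], unsigned char *src[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        sketch_swap_page(dst[i], src[i]);
}
```

Calling sketch_swap_all() twice on the same page lists returns both sets of
pages to their original contents, which is why the same swap can be used both
before jumping to the new kernel and after jumping back.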
> Ok, so due to swapping of source and destination pages first kernel's data
> is still preserved. How do I get the dynamic memory required for second
> kernel boot (without writing first kernel's data)?
All dynamic memory required by the second kernel should be "loaded" by
sys_kexec_load in the first kernel. For example, not only should the Linux
kernel be loaded at 1M, but the memory range 0~16M (excluding the kernel)
should also be "loaded" (as all zeros) by /sbin/kexec via sys_kexec_load.
> > A jump back protocol for kexec is defined and documented. It is an
> > extension to ordinary function calling protocol. So, the facility
> > provided by this patch can be used to call ordinary C function in real
> > mode.
> >
> > A set of flags for sys_kexec_load are added to control which state are
> > saved/restored before/after real mode code executing. For example, you
> > can specify the device state and FPU state are saved/restored
> > before/after real mode code executing.
> >
> > The states (exclude CPU state) save/restore code can be overridden
> > based on the "command" parameter of kexec jump. Because more states
> > need to be saved/restored by hibernating/resuming.
> >
> > Signed-off-by: Huang Ying <ying.huang at intel.com>
> >
> > ---
> > Documentation/i386/jump_back_protocol.txt | 103 ++++++++++++++
> > arch/powerpc/kernel/machine_kexec.c | 2
> > arch/ppc/kernel/machine_kexec.c | 2
> > arch/sh/kernel/machine_kexec.c | 2
> > arch/x86/kernel/machine_kexec_32.c | 88 +++++++++---
> > arch/x86/kernel/machine_kexec_64.c | 2
> > arch/x86/kernel/relocate_kernel_32.S | 214 +++++++++++++++++++++++++++---
> > include/asm-x86/kexec_32.h | 39 ++++-
> > include/linux/kexec.h | 40 +++++
> > kernel/kexec.c | 188 ++++++++++++++++++++++++++
> > kernel/power/Kconfig | 2
> > kernel/sys.c | 35 +++-
> > 12 files changed, 648 insertions(+), 69 deletions(-)
> >
> > --- a/arch/x86/kernel/machine_kexec_32.c
> > +++ b/arch/x86/kernel/machine_kexec_32.c
> > @@ -20,6 +20,7 @@
> > #include <asm/cpufeature.h>
> > #include <asm/desc.h>
> > #include <asm/system.h>
> > +#include <asm/cacheflush.h>
> >
> > #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
> > static u32 kexec_pgd[1024] PAGE_ALIGNED;
> > @@ -83,10 +84,14 @@ static void load_segments(void)
> > * reboot code buffer to allow us to avoid allocations
> > * later.
> > *
> > - * Currently nothing.
> > + * Turn off NX bit for control page.
> > */
> > int machine_kexec_prepare(struct kimage *image)
> > {
> > + if (nx_enabled) {
> > + change_page_attr(image->control_code_page, 1, PAGE_KERNEL_EXEC);
> > + global_flush_tlb();
> > + }
> > return 0;
> > }
> >
> > @@ -96,25 +101,59 @@ int machine_kexec_prepare(struct kimage
> > */
> > void machine_kexec_cleanup(struct kimage *image)
> > {
> > + if (nx_enabled) {
> > + change_page_attr(image->control_code_page, 1, PAGE_KERNEL);
> > + global_flush_tlb();
> > + }
> > +}
> > +
> > +void machine_kexec(struct kimage *image)
> > +{
> > + machine_kexec_call(image, NULL, 0);
> > }
> >
> > /*
> > * Do not allocate memory (or fail in any way) in machine_kexec().
> > * We are past the point of no return, committed to rebooting now.
> > */
> > -NORET_TYPE void machine_kexec(struct kimage *image)
> > +int machine_kexec_vcall(struct kimage *image, unsigned long *ret,
> > + unsigned int argc, va_list args)
> > {
> > unsigned long page_list[PAGES_NR];
> > void *control_page;
> > + asmlinkage NORET_TYPE void
> > + (*relocate_kernel_ptr)(unsigned long indirection_page,
> > + unsigned long control_page,
> > + unsigned long start_address,
> > + unsigned int has_pae) ATTRIB_NORET;
> >
> > /* Interrupts aren't acceptable while we reboot */
> > local_irq_disable();
> >
> > control_page = page_address(image->control_code_page);
> > - memcpy(control_page, relocate_kernel, PAGE_SIZE);
> > + memcpy(control_page, relocate_page, PAGE_SIZE/2);
> > + KCALL_MAGIC(control_page) = 0;
> >
>
> Is 2K sufficient for all the code in relocate_kernel_32.S? What's the
> current size?
The current size is 0x2d7 (727) bytes. I got it through objdump, as
machine_crash_shutdown - relocate_page. I think we have enough space.
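The arithmetic can be checked with a small sketch. It assumes PAGE_SIZE is
4096, so the PAGE_SIZE/2 copied by memcpy() is 2048 bytes; the SKETCH_* names
and helper functions are hypothetical and not part of the patch.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE   4096u                    /* assumption: i386 page size */
#define SKETCH_CODE_BUDGET (SKETCH_PAGE_SIZE / 2u)  /* bytes copied by memcpy() */
#define SKETCH_CODE_SIZE   0x2d7u                   /* size reported by objdump */

/* Fails the build if the relocation code outgrows its half of the page. */
_Static_assert(SKETCH_CODE_SIZE <= SKETCH_CODE_BUDGET,
               "relocation code no longer fits in half a page");

/* Returns nonzero when code_size fits in the budget. */
static int sketch_code_fits(unsigned int code_size)
{
    return code_size <= SKETCH_CODE_BUDGET;
}

/* Returns the number of spare bytes, or 0 if the code does not fit. */
static unsigned int sketch_headroom(unsigned int code_size)
{
    return sketch_code_fits(code_size) ? SKETCH_CODE_BUDGET - code_size : 0u;
}
```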
> > + if (image->preserve_cpu) {
> > + unsigned int i;
> > + KCALL_MAGIC(control_page) = KCALL_MAGIC_NUMBER;
> > + KCALL_ARGC(control_page) = argc;
> > + for (i = 0; i < argc; i++)
> > + KCALL_ARGS(control_page)[i] = \
> > + va_arg(args, unsigned long);
> > +
> > + if (kexec_call_save_cpu(control_page)) {
> > + image->start = KCALL_ENTRY(control_page);
>
> Who fills the entry point at offset 0x200?
The entry point is filled in by the assembler code in relocate_kernel_32.S
upon jumping back. You can find it with "grep ENTRY relocate_kernel_32.S".
>
> [..]
> > extern int machine_kexec_prepare(struct kimage *image);
> > extern void machine_kexec_cleanup(struct kimage *image);
> > extern asmlinkage long sys_kexec_load(unsigned long entry,
> > unsigned long nr_segments,
> > struct kexec_segment __user *segments,
> > unsigned long flags);
> > +extern int kexec_call(struct kimage *image, unsigned long *ret,
> > + unsigned int argc, ...);
>
> Who is using kexec_call(). I can't seem to locate the caller of it.
There is no user of kexec_call() now, but I think it may be useful as a
physical mode caller for some firmware code.
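For reference, the relationship between a variadic front end such as
kexec_call() and a va_list variant such as machine_kexec_vcall() can be
sketched in user space as follows. This is only an illustration of the
calling pattern: struct sketch_ctl, SKETCH_MAX_ARGS, sketch_vcall and
sketch_call are hypothetical names, and the argument limit is an assumption
rather than something taken from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define SKETCH_MAX_ARGS 8u  /* hypothetical limit, not from the patch */

/* Hypothetical stand-in for the argument area of the control page. */
struct sketch_ctl {
    unsigned int argc;
    unsigned long args[SKETCH_MAX_ARGS];
};

/* va_list variant: copies argc unsigned long arguments into ctl. */
static int sketch_vcall(struct sketch_ctl *ctl, unsigned int argc, va_list args)
{
    assert(ctl != NULL);
    if (argc > SKETCH_MAX_ARGS)
        return -1;
    ctl->argc = argc;
    for (unsigned int i = 0; i < argc; i++)
        ctl->args[i] = va_arg(args, unsigned long);
    return 0;
}

/* Variadic front end: forwards its arguments to the va_list variant. */
static int sketch_call(struct sketch_ctl *ctl, unsigned int argc, ...)
{
    va_list args;
    int ret;

    va_start(args, argc);
    ret = sketch_vcall(ctl, argc, args);
    va_end(args);
    return ret;
}
```

Callers must pass every argument as unsigned long (for example 42UL), since
va_arg() reads them with that type.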
Best Regards,
Huang Ying