[PATCH 08/10] arm64/kexec: Add core kexec support
Mark Rutland
mark.rutland at arm.com
Mon Nov 17 08:38:26 PST 2014
On Thu, Nov 13, 2014 at 02:19:48AM +0000, Geoff Levand wrote:
> Hi Mark,
Hi Geoff,
> On Fri, 2014-10-24 at 11:28 +0100, Mark Rutland wrote:
> > > +++ b/arch/arm64/Kconfig
> > > @@ -313,6 +313,15 @@ config ARCH_HAS_CACHE_LINE_SIZE
> > >
> > > source "mm/Kconfig"
> > >
> > > +config KEXEC
> > > + depends on (!SMP || PM_SLEEP_SMP)
> >
> > In its current state this also depends on !KVM && !EFI (technically you
> > could detect those cases at runtime, but I don't see that in this
> > series).
>
> A kernel built with CONFIG_EFI is OK if run on a non-EFI system or
> without using a system's EFI support. I added a patch that adds
> runtime checks in the kexec_load syscall path to print a message
> and return failure for situations where KVM or EFI won't work.
>
> > > +/**
> > > + * machine_kexec_prepare - Prepare for a kexec reboot.
> > > + *
> > > + * Called from the core kexec code when a kernel image is loaded.
> > > + */
> > > +
> > > +int machine_kexec_prepare(struct kimage *image)
> > > +{
> > > + const struct kexec_segment *dtb_seg = kexec_find_dtb_seg(image);
> > > +
> > > + if (!dtb_seg)
> > > + pr_warn("%s: No device tree segment found.\n", __func__);
> > > +
> > > + arm64_kexec_dtb_addr = dtb_seg ? dtb_seg->mem : 0;
> > > + arm64_kexec_kimage_start = image->start;
> > > +
> > > + return 0;
> > > +}
> >
> > I thought all of the DTB handling was moving to purgatory?
>
> Non-purgatory booting is needed for kexec-lite. We can do
> this simple check here which optionally sets x0 to the dtb
> address to support that. The other solution is to have a
> trampoline in kexec-lite that sets x0 (basically an absolute
> minimal purgatory), but I think to do it here is nicer, and
> is also the same way that the arm arch code does it.
>
> Maybe removing this pr_warn message and just relying on the
> kexec_image_info() output would be better.
I mentioned previously that I don't think the "kexec-lite" approach is a
good one, especially if we're going to have userspace purgatory code
anyway. It embeds a policy w.r.t. the segment handling within the
kernel, on the assumption of a specific use-case for what is a more
general mechanism.
Unfortunately secureboot with kexec_file_load will require a kernelspace
purgatory and likely special DT handling, but it's already a far more
limited interface.
> > > +/**
> > > + * kexec_list_flush - Helper to flush the kimage list to PoC.
> > > + */
> > > +
> > > +static void kexec_list_flush(unsigned long kimage_head)
> > > +{
> > > + void *dest;
> > > + unsigned long *entry;
> > > +
> > > + for (entry = &kimage_head, dest = NULL; ; entry++) {
> > > + unsigned int flag = *entry &
> > > + (IND_DESTINATION | IND_INDIRECTION | IND_DONE |
> > > + IND_SOURCE);
> > > + void *addr = phys_to_virt(*entry & PAGE_MASK);
> > > +
> > > + switch (flag) {
> > > + case IND_INDIRECTION:
> > > + entry = (unsigned long *)addr - 1;
> > > + __flush_dcache_area(addr, PAGE_SIZE);
> > > + break;
> > > + case IND_DESTINATION:
> > > + dest = addr;
> > > + break;
> > > + case IND_SOURCE:
> > > + __flush_dcache_area(addr, PAGE_SIZE);
> > > + dest += PAGE_SIZE;
> > > + break;
> > > + case IND_DONE:
> > > + return;
> > > + default:
> > > + break;
> >
> > Can an image ever have no flags? Given the presence of IND_NONE I'd
> > assume not, so this looks like a candidate for a BUG().
>
> Sure, I guess things will blow up before it ever gets here though.
I would hope so. If we trigger the BUG() we know otherwise.
>
> >
> > > + }
> > > + }
> > > +}
> > > +
> > > +/**
> > > + * machine_kexec - Do the kexec reboot.
> > > + *
> > > + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
> > > + */
> > > +
> > > +void machine_kexec(struct kimage *image)
> > > +{
> > > + phys_addr_t reboot_code_buffer_phys;
> > > + void *reboot_code_buffer;
> > > +
> > > + BUG_ON(num_online_cpus() > 1);
> > > +
> > > + arm64_kexec_kimage_head = image->head;
> > > +
> > > + reboot_code_buffer_phys = page_to_phys(image->control_code_page);
> > > + reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
> > > +
> > > + /*
> > > + * Copy relocate_new_kernel to the reboot_code_buffer for use
> > > + * after the kernel is shut down.
> > > + */
> > > +
> > > + memcpy(reboot_code_buffer, relocate_new_kernel,
> > > + relocate_new_kernel_size);
> >
> > Can we get rid of the line gaps between comments and the single function
> > calls they apply to, please? I realise it's a minor thing, but this
> > looks rather inconsistent with the rest of arch/arm64/.
>
> checkpatch doesn't seem to mind, but sure, I can do that.
Cheers.
> > > +
> > > + /* Flush the reboot_code_buffer in preparation for its execution. */
> > > +
> > > + __flush_dcache_area(reboot_code_buffer, relocate_new_kernel_size);
> >
> > That code should already be at the PoC per the boot protocol (the entire
> > kernel image should have been clean to the PoC at boot, so the
> > instructions forming relocate_new_kernel are globally visible).
> >
> > From the looks of it you only need to flush the variables at the very
> > end.
>
> We copy the relocate_new_kernel routine to reboot_code_buffer, which is
> a buffer allocated by the kexec core with alloc_pages(). That copy of
> relocate_new_kernel is what we are flushing here.
>
> I believe we need to flush that buffer out to PoC so we can execute
> the code it contains after cpu_soft_restart().
Apologies. I'd evidently confused myself here regarding what was being
flushed.
>
> > > --- /dev/null
> > > +++ b/arch/arm64/kernel/relocate_kernel.S
> > > @@ -0,0 +1,184 @@
> > > +/*
> > > + * kexec for arm64
> > > + *
> > > + * Copyright (C) Linaro.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2 as
> > > + * published by the Free Software Foundation.
> > > + */
> > > +
> > > +#include <asm/assembler.h>
> > > +#include <asm/kexec.h>
> > > +#include <asm/memory.h>
> > > +#include <asm/page.h>
> > > +#include <asm/proc-macros.S>
> > > +
> > > +/* The list entry flags. */
> > > +
> > > +#define IND_DESTINATION_BIT 0
> > > +#define IND_INDIRECTION_BIT 1
> > > +#define IND_DONE_BIT 2
> > > +#define IND_SOURCE_BIT 3
> >
> > As previously, I think these need to be moved into a common header, and
> > defined in terms of the existing IND_* macros (or vice-versa). I believe
> > you had a patch doing so; what's the status of that?
>
> Still working to get it merged:
>
> https://lkml.org/lkml/2014/11/12/675
Ok. Let's hope that goes through soon.
>
> > > +/*
> > > + * relocate_new_kernel - Put a 2nd stage kernel image in place and boot it.
> > > + *
> > > + * The memory that the old kernel occupies may be overwritten when coping the
> > > + * new image to its final location. To assure that the relocate_new_kernel
> > > + * routine which does that copy is not overwritten all code and data needed
> > > + * by relocate_new_kernel must be between the symbols relocate_new_kernel and
> > > + * relocate_new_kernel_end. The machine_kexec() routine will copy
> > > + * relocate_new_kernel to the kexec control_code_page, a special page which
> > > + * has been set up to be preserved during the copy operation.
> > > + */
> > > +
> > > +.globl relocate_new_kernel
> > > +relocate_new_kernel:
>
> ...
>
> > > +
> > > + /* start_new_image */
> > > +
> > > + ldr x4, arm64_kexec_kimage_start
> > > + ldr x0, arm64_kexec_dtb_addr
> > > + mov x1, xzr
> > > + mov x2, xzr
> > > + mov x3, xzr
> > > + br x4
> >
> > This last part should be in userspace-provided purgatory. If you have
> > purgatory code which does this then we should be able to rely on that,
> > and we don't have to try to maintain this DTB handling in kernelspace
> > (which I suspect may become painful as the boot protocol evolves).
>
> I think the putting the dtb address in x0 is already fixed. There are
> users with firmware that does this and any change to the boot protocol
> will have to work with it.
Sure, but that is the _Linux_ boot protocol, and the Kconfig description
of kexec stats "you can start any kernel with it, not just Linux". Why
should we embed Linux-specific details into a supposedly generic
mechanism?
We may also extend the boot protocol, and I would rather not have to
manage the complexity of each possible extension within the kernel,
especially given that the only context we can pass in kexec is segments.
> As I mentioned above, we need a solution for non-purgatory re-boot and I
> think this is the best way.
Why do we need a solution for "non-purgatory re-boot"? As far as I can
see this is a non-problem.
Thanks,
Mark.
More information about the linux-arm-kernel
mailing list