[PATCH 08/16] arm64/kexec: Add core kexec support
Geoff Levand
geoff at infradead.org
Mon Nov 2 16:30:16 PST 2015
Hi James,
On Fri, 2015-10-30 at 16:29 +0000, James Morse wrote:
> On 20/10/15 00:38, Geoff Levand wrote:
> > +config KEXEC
> > +> > > > depends on (!SMP || PM_SLEEP_SMP)
>
> Commit 4b3dc9679cf7 got rid of '!SMP'.
Fixed for v11.
> > - * Copyright (C) 2015 Huawei Futurewei Technologies.
> > + * Copyright (C) Huawei Futurewei Technologies.
>
> Move this hunk into the patch that adds the file?
Was fixed in v10.2.
> > +++ b/arch/arm64/kernel/relocate_kernel.S
> If I've followed all this through properly:
>
> With KVM - mmu+caches are configured, but then disabled by 'kvm: allows kvm
> cpu hotplug'. This 'arm64_relocate_new_kernel' function then runs at EL2
> with M=0, C=0, I=0.
>
> Without KVM - when there is no user of EL2, the mmu+caches are left in
> whatever state the bootloader (or efi stub) left them in. From
> Documentation/arm64/booting.txt:
> > Instruction cache may be on or off.
> and
> > System caches which respect the architected cache maintenance by VA
> > operations must be configured and may be enabled.
>
> So 'arm64_relocate_new_kernel' function could run at EL2 with M=0, C=?, I=?.
>
> I think this means you can't guarantee anything you are copying below
> actually makes it through the caches - booting secondary processors may get
> stale values.
>
> The EFI stub disables the M and C bits when booted at EL2 with uefi - but
> it leaves the instruction cache enabled. You only clean the
> reboot_code_buffer from the data cache, so there may be stale values in the
> instruction cache.
>
> I think you need to disable the i-cache at EL1. If you jump to EL2, I think
> you need to disable the I/C bits there too - as you can't rely on the code
> in 'kvm: allows kvm cpu hotplug' to do this in a non-kvm case.
For consistency across all code paths, we could put in something like this:
+ /* Clear SCTLR_ELx_FLAGS. */
+ mrs x0, CurrentEL
+ cmp x0, #CurrentEL_EL2
+ b.ne 1f
+ mrs x0, sctlr_el2
+ ldr x1, =SCTLR_EL2_FLAGS
+ bic x0, x0, x1
+ msr sctlr_el2, x0
+ isb
+ b 2f
+1: mrs x0, sctlr_el1
+ ldr x1, =SCTLR_EL2_FLAGS
+ bic x0, x0, x1
+ msr sctlr_el1, x0
+ isb
> > +.Ldone:
> > +> > > > dsb> > > > sy
> > +> > > > isb
> > +> > > > ic> > > > ialluis
> > +> > > > dsb> > > > sy
>
> Why the second dsb?
I removed the first isb as Mark suggested.
>
> > +> > > > isb
> > +
> > +> > > > /* Start new image. */
> > +> > > > ldr> > > > x4, .Lkimage_start
> > +> > > > mov> > > > x0, xzr
> > +> > > > mov> > > > x1, xzr
> > +> > > > mov> > > > x2, xzr
> > +> > > > mov> > > > x3, xzr
>
> Once the kexec'd kernel is booting, I get:
> > WARNING: x1-x3 nonzero in violation of boot protocol:
> > x1: 0000000080008000
> > x2: 0000000000000020
> > x3: 0000000000000020
> > This indicates a broken bootloader or old kernel
>
> Presumably this 'kimage_start' isn't pointing to the new kernel, but the
> purgatory code, (which comes from user-space?). (If so what are these xzr-s
> for?)
The warning was from the arm64 purgatory in kexec-tools, now fixed.
We don't need to zero the registers anymore. At one time I had
an option where the kernel found the dtb section and jumped
directly to the new image as the 32 bit arm kernel does.
> +/* The machine_kexec routine sets these variables via offsets from
> > + * arm64_relocate_new_kernel.
> > + */
> > +
> > +/*
> > + * .Lkimage_start - Copy of image->start, the entry point of the new
> > + * image.
> > + */
> > +.Lkimage_start:
> > +> > > > .quad> > > > 0x0
> > +
> > +/*
> > + * .Lkimage_head - Copy of image->head, the list of kimage entries.
> > + */
> > +.Lkimage_head:
> > +> > > > .quad> > > > 0x0
> > +
>
> I assume these .quad-s are used because you can't pass the values in via
> registers - due to the complicated soft_restart(). Given you are the only
> user, couldn't you simplify it to do all the disabling in
> arm64_relocate_new_kernel?
I moved some things from cpu_reset to arm64_relocate_new_kernel, but
from what Takahiro has said, to support a modular kvm some of the CPU
shutdown code will be shared. Maybe we can look into simplifying things
once work on modular kvm is started.
>
> From 'kexec -e' to the first messages from the new kernel takes ~1 minute
> on Juno, Did you see a similar delay? Or should I go looking for what I've
> configured wrong!?
As Pratyush has mentioned this is most likely due to the dcaches
being disabled.
> (Copying code with the mmu+caches on, then cleaning to PoC was noticeably
> faster for hibernate)
>
>
> I've used this series for kexec-ing between 4K and 64K page_size kernels on
> Juno.
Thanks for testing.
-Geoff
More information about the kexec
mailing list