[PATCH v18 03/13] arm64/kexec: Add core kexec support
Geoff Levand
geoff at infradead.org
Thu Jun 16 15:41:06 PDT 2016
On Wed, 2016-06-15 at 18:10 +0100, James Morse wrote:
> On 09/06/16 21:08, Geoff Levand wrote:
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -0,0 +1,185 @@
> > +/*
> > + * kexec for arm64
> > + *
> > + * Copyright (C) Linaro.
> > + * Copyright (C) Huawei Futurewei Technologies.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#include
>
> We don't have/need highmem on arm64. The kmap()/kunmap() calls just obscure what
> is going on.
>
>
> > +#include
> > +#include
>
> What do you need of_fdt.h for? I guess this should be in patch 4.
>
>
> > +#include
>
> The control page was already allocated, I can't see anything else being
> allocated... What do you need slab.h for?
>
>
> > +#include
> > +#include
>
> User space access? I guess this should be in patch 4.
>
>
> > +
> > +#include
> > +#include
> > +#include
> > +#include
>
> I can't see anything in system_misc.h that you are using in here.
I cleaned up all these includes.
> > + * kexec_list_flush - Helper to flush the kimage list to PoC.
> > + */
> > +static void kexec_list_flush(struct kimage *kimage)
> > +{
> > +	kimage_entry_t *entry;
> > +	unsigned int flag;
> > +
> > +	for (entry = &kimage->head, flag = 0; flag != IND_DONE; entry++) {
> > +		void *addr = kmap(phys_to_page(*entry & PAGE_MASK));
> > +
> > +		flag = *entry & IND_FLAGS;
> > +
> > +		switch (flag) {
> > +		case IND_INDIRECTION:
> > +			entry = (kimage_entry_t *)addr - 1;
>
> This '-1' is so that entry points before the first entry of the new table,
> and is un-done by entry++ next time round the loop...
> If I'm right, could you add a comment to that effect? It took me a little while
> to work out!
I added a comment.
> kexec_core.c has a snazzy macro: for_each_kimage_entry(), it's a shame it's not in
> a header file.
> This loop does the same but with two variables instead of three. These
> IND_INDIRECTION pages only appear at the end of a list, this list-walking looks
> correct.
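
The '-1' adjustment discussed above can be tried out in userspace. What follows is a minimal sketch, not kernel code. The names entry_t, ADDR_MASK, count_sources() and demo_count() are hypothetical, the IND_* values are assumed to match include/linux/kexec.h, and small 16-byte aligned buffers stand in for pages so that the low four bits are free for flags. Both tables live inside one array so that stepping one slot back stays inside a valid object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values assumed to match include/linux/kexec.h. */
#define IND_DESTINATION	0x1u
#define IND_INDIRECTION	0x2u
#define IND_DONE	0x4u
#define IND_SOURCE	0x8u
#define IND_FLAGS	(IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)

typedef uintptr_t entry_t;			/* stand-in for kimage_entry_t */
#define ADDR_MASK	(~(entry_t)IND_FLAGS)	/* stand-in for PAGE_MASK */

/* Walk the list the way the quoted loop does and count IND_SOURCE entries. */
static size_t count_sources(entry_t *head)
{
	entry_t *entry;
	unsigned int flag = 0;
	size_t sources = 0;

	for (entry = head; flag != IND_DONE; entry++) {
		void *addr = (void *)(*entry & ADDR_MASK);

		flag = (unsigned int)(*entry & IND_FLAGS);
		switch (flag) {
		case IND_INDIRECTION:
			/* One slot before the new table; entry++ undoes it. */
			entry = (entry_t *)addr - 1;
			break;
		case IND_SOURCE:
			sources++;
			break;
		case IND_DESTINATION:
		case IND_DONE:
			break;
		default:
			assert(0);	/* stands in for BUG() */
			break;
		}
	}
	return sources;
}

/* Build a two-table list: head -> pool[4..6] -> pool[8..10]. */
static size_t demo_count(void)
{
	static _Alignas(16) entry_t pool[12];
	static _Alignas(16) unsigned char page[4][16];
	entry_t head = (entry_t)&pool[4] | IND_INDIRECTION;

	pool[4]  = (entry_t)page[0] | IND_DESTINATION;
	pool[5]  = (entry_t)page[1] | IND_SOURCE;
	pool[6]  = (entry_t)&pool[8] | IND_INDIRECTION;
	pool[8]  = (entry_t)page[2] | IND_SOURCE;
	pool[9]  = (entry_t)page[3] | IND_SOURCE;
	pool[10] = IND_DONE;

	return count_sources(&head);
}
```

Calling demo_count() walks both tables and counts the three IND_SOURCE entries, which only works if the increment lands on the first slot of each new table.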
>
>
> > +			__flush_dcache_area(addr, PAGE_SIZE);
>
> So if we find an indirection pointer, we switch entry to the new page, and clean
> it to the PoC, because later we walk this list with the MMU off.
>
> But what cleans the very first page?
I don't think this routine was doing quite the right thing. The
arm64_relocate_new_kernel routine uses the list (the entries) and
the second stage kernel buffers (the IND_SOURCE pages). Those two
things are what should be flushed here.
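
One possible shape for a routine that cleans exactly those two things is sketched below as a userspace simulation. It is an illustration under assumptions, not the reworked patch: list_flush(), demo_flush(), struct flush_stats, entry_t and ADDR_MASK are hypothetical names, the counters stand in for calls to __flush_dcache_area(), and the IND_* values are assumed to match include/linux/kexec.h. Because every visited slot is counted, the head entry and the first table are covered too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values assumed to match include/linux/kexec.h. */
#define IND_DESTINATION	0x1u
#define IND_INDIRECTION	0x2u
#define IND_DONE	0x4u
#define IND_SOURCE	0x8u
#define IND_FLAGS	(IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)

typedef uintptr_t entry_t;			/* stand-in for kimage_entry_t */
#define ADDR_MASK	(~(entry_t)IND_FLAGS)	/* stand-in for PAGE_MASK */

struct flush_stats {
	size_t entries;		/* list slots that would be cleaned */
	size_t source_pages;	/* IND_SOURCE pages that would be cleaned */
};

static void list_flush(entry_t *head, struct flush_stats *st)
{
	entry_t *entry;

	for (entry = head; ; entry++) {
		unsigned int flag = (unsigned int)(*entry & IND_FLAGS);

		st->entries++;		/* clean this list entry */
		if (flag == IND_DONE)
			break;

		switch (flag) {
		case IND_INDIRECTION:
			/* Point just before the new table; entry++ fixes it up. */
			entry = (entry_t *)(*entry & ADDR_MASK) - 1;
			break;
		case IND_SOURCE:
			st->source_pages++;	/* clean the source page */
			break;
		case IND_DESTINATION:
			break;
		default:
			assert(0);	/* stands in for BUG() */
			break;
		}
	}
}

/* Same two-table layout: head -> pool[4..6] -> pool[8..10]. */
static struct flush_stats demo_flush(void)
{
	static _Alignas(16) entry_t pool[12];
	static _Alignas(16) unsigned char page[4][16];
	entry_t head = (entry_t)&pool[4] | IND_INDIRECTION;
	struct flush_stats st = { 0, 0 };

	pool[4]  = (entry_t)page[0] | IND_DESTINATION;
	pool[5]  = (entry_t)page[1] | IND_SOURCE;
	pool[6]  = (entry_t)&pool[8] | IND_INDIRECTION;
	pool[8]  = (entry_t)page[2] | IND_SOURCE;
	pool[9]  = (entry_t)page[3] | IND_SOURCE;
	pool[10] = IND_DONE;

	list_flush(&head, &st);
	return st;
}
```

In this layout the walk touches seven list slots (the head plus three in each table) and three source pages. Destination pages are deliberately left alone.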
> > +			break;
> > +		case IND_DESTINATION:
> > +			break;
> > +		case IND_SOURCE:
> > +			__flush_dcache_area(addr, PAGE_SIZE);
> > +			break;
> > +		case IND_DONE:
> > +			break;
> > +		default:
> > +			BUG();
>
> Unless you think it's less readable, you could group the clauses together:
Takahiro found a bug when CONFIG_SPARSEMEM_VMEMMAP=n, and this code
has now been reworked.
>		case IND_INDIRECTION:
>			entry = (kimage_entry_t *)addr - 1;
>		case IND_SOURCE:
>			__flush_dcache_area(addr, PAGE_SIZE);
>		case IND_DESTINATION:
>		case IND_DONE:
>			break;
> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> > new file mode 100644
> > index 0000000..e380db3
> > --- /dev/null
> > +++ b/arch/arm64/kernel/relocate_kernel.S
> > @@ -0,0 +1,131 @@
> > +.globl arm64_relocate_new_kernel
> > +arm64_relocate_new_kernel:
>
> All the other asm functions use ENTRY(), which would do the .globl and alignment
> for you. (You would need an ENDPROC(arm64_relocate_new_kernel) too.)
Sure.
> > +
> > +	/* Setup the list loop variables. */
> > +	mov	x18, x1				/* x18 = kimage_start */
> > +	mov	x17, x0				/* x17 = kimage_head */
> > +	dcache_line_size x16, x0		/* x16 = dcache line size */
> > +	mov	x15, xzr			/* x15 = segment start */
>
> What uses this 'segment start'?
That is left over from when we booted without purgatory (as
the arm arch does).
> > +	mov	x14, xzr			/* x14 = entry ptr */
> > +	mov	x13, xzr			/* x13 = copy dest */
> > +
> > +	/* Clear the sctlr_el2 flags. */
> > +	mrs	x0, CurrentEL
> > +	cmp	x0, #CurrentEL_EL2
> > +	b.ne	1f
> > +	mrs	x0, sctlr_el2
> > +	ldr	x1, =SCTLR_ELx_FLAGS
> > +	bic	x0, x0, x1
> > +	msr	sctlr_el2, x0
> > +	isb
> > +1:
> > +
> > +	/* Check if the new image needs relocation. */
> > +	cbz	x17, .Ldone
>
> Does this happen? Do we ever come across an empty slot in the tables?
>
> kimage_terminate() adds the IND_DONE entry, so we should never see an empty
> slot. kexec_list_flush() would BUG() on this too, and we call that
> unconditionally on the way in here.
I put that in just in case, but never checked if it would
ever actually happen. I can take it out.
> > +	tbnz	x17, IND_DONE_BIT, .Ldone
> > +
> > +.Lloop:
> > +	and	x12, x17, PAGE_MASK		/* x12 = addr */
> > +
> > +	/* Test the entry flags. */
> > +.Ltest_source:
> > +	tbz	x17, IND_SOURCE_BIT, .Ltest_indirection
> > +
> > +	/* Invalidate dest page to PoC. */
> > +	mov x0, x13
> > +	add x20, x0, #PAGE_SIZE
> > +	sub x1, x16, #1
> > +	bic x0, x0, x1
> > +2:	dc ivac, x0
>
> This relies on an IND_DESTINATION being found first for x13 to be set to
> something other than 0. I guess if kexec-core hands us a broken list, all bets
> are off!
Yes, assumed to be IND_DESTINATION.
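
For readers following the invalidate sequence quoted above, the address arithmetic can be mirrored in C. This is only a sketch: the tail of the loop is not quoted in this mail, so the increment-and-compare step below is an assumption about a conventional implementation, and lines_in_page() is a hypothetical helper that counts iterations instead of issuing cache maintenance.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the cache-line operations needed to cover one page starting at
 * dest. line_size must be a power of two.
 */
static size_t lines_in_page(uintptr_t dest, size_t page_size, size_t line_size)
{
	uintptr_t end = dest + page_size;			/* add x20, x0, #PAGE_SIZE */
	uintptr_t addr = dest & ~(uintptr_t)(line_size - 1);	/* sub x1, x16, #1; bic x0, x0, x1 */
	size_t ops = 0;

	while (addr < end) {
		ops++;			/* "dc ivac, x0" would be issued here */
		addr += line_size;	/* assumed loop step, not quoted above */
	}
	return ops;
}
```

With a 4096-byte page and 64-byte lines an aligned destination needs 64 operations. A destination that starts mid-line needs one more, because the first line is rounded down.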
> > +
> > +.Ldone:
>
> /* wait for writes from copy_page to finish */
Added.
> > + dsb nsh
> > +	ic	iallu
> > +	dsb	nsh
> > +	isb
> > +
> > +	/* Start new image. */
> > +	mov	x0, xzr
> > +	mov	x1, xzr
> > +	mov	x2, xzr
> > +	mov	x3, xzr
> > +	br	x18
> > +
> > +.ltorg
> > +
> > +.align 3	/* To keep the 64-bit values below naturally aligned. */
> > +
> > +.Lcopy_end:
> > +.org	KEXEC_CONTROL_PAGE_SIZE
>
> Why do we need to pad up to KEXEC_CONTROL_PAGE_SIZE?
> In machine_kexec() we only copy arm64_relocate_new_kernel_size bytes, so it
> shouldn't matter what is here. As far as I can see we don't even access it.
This is to check if arm64_relocate_new_kernel gets too
big. The assembler should give an error if the location
counter is set backwards.
-Geoff
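
The .org size check described above has a rough C analogue in a compile-time assertion. The sketch below is illustrative only: CONTROL_PAGE_SIZE, reloc_blob and reloc_headroom() are hypothetical stand-ins, and the value 4096 is an assumption rather than the kernel's definition of KEXEC_CONTROL_PAGE_SIZE.

```c
#include <stddef.h>

/* Hypothetical stand-in for KEXEC_CONTROL_PAGE_SIZE. */
#define CONTROL_PAGE_SIZE 4096

/* Hypothetical blob standing in for the relocation code: one A64 NOP. */
static const unsigned char reloc_blob[] = { 0x1f, 0x20, 0x03, 0xd5 };

/* The build fails here if the blob outgrows the control page. */
_Static_assert(sizeof(reloc_blob) <= CONTROL_PAGE_SIZE,
	       "relocation code exceeds the control page");

/* Bytes still free in the control page. */
static size_t reloc_headroom(void)
{
	return CONTROL_PAGE_SIZE - sizeof(reloc_blob);
}
```

As with the assembler check, the failure shows up at build time rather than as a truncated copy at run time.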