[PATCH] arm64/relocate_kernel: remove redundant but misleading code

Pingfan Liu kernelfans at gmail.com
Fri Aug 7 10:14:41 EDT 2020


Hi Morse,

Appreciate for your patient explanation. I have no experience of
arm/kvm and after a quick through, I still have some questions. Please
correct me if I am wrong.

On Thu, Aug 6, 2020 at 8:20 PM James Morse <james.morse at arm.com> wrote:
>
> Hi Liu,
>
> On 06/08/2020 09:26, Pingfan Liu wrote:
> > The kexec switch sequence looks like the following:
> >   SYM_CODE_START(__cpu_soft_restart)
> >     ...
> >     pre_disable_mmu_workaround
> >     msr       sctlr_el1, x12
> >     ...
> >     br        x8
> >
> >   SYM_CODE_START(arm64_relocate_new_kernel)
> >     ...
> >     pre_disable_mmu_workaround
> >     msr       sctlr_el2, x0
> >     ...
>
> > "msr sctlr_el2, x0" is misleading, because "br x8" jump to a physical
> > address, which has no entry in idmap.
>
> Even better: this code run from a copy allocated by kexec, its not in the idmap either.
Yes, looks better.
>
> See the memcpy() in machine_kexec().
>
>
> > It implies that MMU has already been fully off after "msr sctlr_el1, x12".
>
>
> > And according to "D12.2.101 SCTLR_EL2, System Control Register (EL2)" in
> > "ARM Architecture Reference Manual", actually, EL2&0 host accesses
> > to SCTLR_EL2 when using mnemonic SCTLR_EL1.
>
> Only when HCR_EL2.E2H is enabled. If linux booted at EL2 on a non-VHE system, SCTLR_EL1
> and SCTLR_EL2 are different registers, both of which are managed by linux/KVM.
Yeah. I have realized these factors when composing this patch, but not
sure anything is missing.

Kexec on arm is introduced by the following two commits
d28f6df arm64/kexec: Add core kexec support
f9076ec arm64: Add back cpu reset routines

And at d28f6df, the code snippet is
in kernel/cpu-reset.S,
ENTRY(__cpu_soft_restart)
        /* Clear sctlr_el1 flags. */
        mrs     x12, sctlr_el1
        ldr     x13, =SCTLR_ELx_FLAGS
        bic     x12, x12, x13
        msr     sctlr_el1, x12
        isb

        cbz     x0, 1f                          // el2_switch?
        mov     x0, #HVC_SOFT_RESTART
        hvc     #0                              // no return
        //--> as the note, I think for both EL1&0 host and guest, they
will never resume the following code
        //--> so in the original patch, the rest code is only for EL2&0 host

1:      mov     x18, x1                         // entry
        mov     x0, x2                          // arg0
        mov     x1, x3                          // arg1
        mov     x2, x4                          // arg2
        br      x18
ENDPROC(__cpu_soft_restart)

And in kernel/hyp-stub.S
el1_sync:
        mrs     x30, esr_el2
        lsr     x30, x30, #ESR_ELx_EC_SHIFT

        cmp     x30, #ESR_ELx_EC_HVC64
        b.ne    9f                              // Not an HVC trap

        cmp     x0, #HVC_GET_VECTORS
        b.ne    1f
        mrs     x0, vbar_el2
        b       9f

1:      cmp     x0, #HVC_SET_VECTORS
        b.ne    2f
        msr     vbar_el2, x1
        b       9f

2:      cmp     x0, #HVC_SOFT_RESTART
        b.ne    3f
        mov     x0, x2
        mov     x2, x4
        mov     x4, x1
        mov     x1, x3
        br      x4                              // no return

        /* Someone called kvm_call_hyp() against the hyp-stub... */
3:      mov     x0, #ARM_EXCEPTION_HYP_GONE

9:      eret
ENDPROC(el1_sync)

I think for a soft restart, it jumps to the new kernel by 'br x4'. But
it seems a bug, which does not clear I+C bits, not disable EL2
translation regime.

On the other hand, if it wants to resume execution immediately after
"hvc #0" in ENTRY(__cpu_soft_restart), it should eret by the
modification of "spsr_el2.M[3:0] = PSR_MODE_EL2h", the code originates
from EL1, but the rest code tries to modify EL2 registers.

By this way, the code can do the exact things you mentioned. But there
seems to be a missing of such mechanisms.
>
>
> > Hence removing the redundant but misleading code.
>
> This isn't the reason its redundant...
>
>
> > diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> > index 4a18055..37721eb 100644
> > --- a/arch/arm64/kernel/cpu-reset.S
> > +++ b/arch/arm64/kernel/cpu-reset.S
> > @@ -35,6 +35,10 @@ SYM_CODE_START(__cpu_soft_restart)
> >       mov_q   x13, SCTLR_ELx_FLAGS
> >       bic     x12, x12, x13
> >       pre_disable_mmu_workaround
> > +     /*
> > +      * either disable EL1&0 translation regime or disable EL2&0 translation
> > +      * regime if HCR_EL2.E2H == 1
> > +      */>    msr     sctlr_el1, x12
> >       isb
>
> On a VHE system, yes the cpu-reset.S disables EL2&0 by writing to SCTLR_EL1
>
> But on a non-VHE system, that same code disabled EL1&0. cup-reset.S goes on to call
> HVC_SOFT_RESTART for EL2, which may be serviced by KVM or the hyp-stub. (or maybe
> something else that implements the hyp-stub api)
>
> For kexec, on non-VHE both EL1&0 and EL2 get disabled.
Yes. I see the latest code in SYM_CODE_START(__kvm_handle_stub_hvc)
does the things exactly as what you said.
>
>
> > diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> > index 542d6ed..84eec95 100644
> > --- a/arch/arm64/kernel/relocate_kernel.S
> > +++ b/arch/arm64/kernel/relocate_kernel.S
> > @@ -36,18 +36,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
> >       mov     x14, xzr                        /* x14 = entry ptr */
> >       mov     x13, xzr                        /* x13 = copy dest */
> >
> > -     /* Clear the sctlr_el2 flags. */
> > -     mrs     x0, CurrentEL
> > -     cmp     x0, #CurrentEL_EL2
> > -     b.ne    1f
> > -     mrs     x0, sctlr_el2
> > -     mov_q   x1, SCTLR_ELx_FLAGS
> > -     bic     x0, x0, x1
> > -     pre_disable_mmu_workaround
> > -     msr     sctlr_el2, x0
> > -     isb
> > -1:
>
> I agree this doesn't disable the MMU anymore.
>
> This was originally kept to disable the I+C bits when Kdump interrupted KVM, but since KVM
> formalised the hyp-stub API, and has this exact sequence to back its HVC_SOFT_RESTART, it
> was only needed for the hyp-stub itself, which has no clue about these SCTLR_EL2 bits.
>
> HVC_SOFT_RESTART only says it needs to disable the MMU. See
> Documentation/virt/kvm/arm/hyp-abi.rst
>
>
> I think its fine to remove this, but the reason is because el2_setup doesn't set those
> bits, and KVM clears them when its unloaded, or has a HVC_SOFT_RESTART call.
>
> It might be worth updating the document, but we'd need to check the guarantee is the same
> on 32bit. I assume there is no out-of-tree user of the hyp-stub abi.
I have no idea about 32bit. Could you enlighten me?

Again, I am a fresh man on arm/kvm. Please forgive me if I make any
silly mistakes.

Thanks for your time.

Pingfan



More information about the linux-arm-kernel mailing list