[PATCH 01/10] KVM: arm64: Fix clobbered ELR in sync abort
Marc Zyngier
maz at kernel.org
Sun Mar 17 05:26:22 PDT 2024
On Thu, 14 Mar 2024 20:23:22 +0000,
Pierre-Clément Tosi <ptosi at google.com> wrote:
>
> When the hypervisor receives a SError or synchronous exception (EL2h)
nit: since this also affects SError, $subject seems wrong.
> while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to
> an extable entry, it panics indirectly by overwriting ELR with the
> address of a panic handler in order for the asm routine it returns to to
> ERET into the handler. This is done (instead of a simple function call)
> to ensure that the panic handler runs with the SPSel that was in use
> when the exception was triggered, necessary to support features such as
> the shadow call stack.
I don't understand this. Who changes SPsel? At any given point, SP_EL0
is that of either the host or the guest, but never the hypervisor's.
If something was touching SP_EL0 behind our back, it would result in
corruption (see 6e977984f6d8 for a bit of rationale about it).
>
> However, this clobbers ELR_EL2 for the handler itself. As a result,
> hyp_panic(), when retrieving what it believes to be the PC where the
> exception happened, actually ends up reading the address of the panic
> handler that called it! This results in an erroneous and confusing panic
> message where the source of any synchronous exception (e.g. BUG() or
> kCFI) appears to be __guest_exit_panic, making it hard to locate the
> actual BRK instruction.
>
> Therefore, store the original ELR_EL2 in a per-CPU struct and point the
Can you elaborate on *which* per-CPU structure you are using?
> sysreg to a routine that first restores it to its previous value before
> running __guest_exit_panic.
>
> Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context")
> Signed-off-by: Pierre-Clément Tosi <ptosi at google.com>
> ---
> arch/arm64/kernel/asm-offsets.c | 1 +
> arch/arm64/kvm/hyp/entry.S | 9 +++++++++
> arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++--
> 3 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 5a7dbbe0ce63..e62353168a57 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -128,6 +128,7 @@ int main(void)
> DEFINE(VCPU_FAULT_DISR, offsetof(struct kvm_vcpu, arch.fault.disr_el1));
> DEFINE(VCPU_HCR_EL2, offsetof(struct kvm_vcpu, arch.hcr_el2));
> DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_cpu_context, regs));
> + DEFINE(CPU_ELR_EL2, offsetof(struct kvm_cpu_context, sys_regs[ELR_EL2]));
> DEFINE(CPU_RGSR_EL1, offsetof(struct kvm_cpu_context, sys_regs[RGSR_EL1]));
> DEFINE(CPU_GCR_EL1, offsetof(struct kvm_cpu_context, sys_regs[GCR_EL1]));
> DEFINE(CPU_APIAKEYLO_EL1, offsetof(struct kvm_cpu_context, sys_regs[APIAKEYLO_EL1]));
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index f3aa7738b477..9cdf46da3051 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -83,6 +83,15 @@ alternative_else_nop_endif
> eret
> sb
>
> +SYM_INNER_LABEL(__guest_exit_panic_with_restored_elr, SYM_L_GLOBAL)
The name of the function isn't great. Crucially, 'with_restored_elr'
implies that ELR is already restored, while it is the hunk below that
does the 'restore' part.
Something like '__guest_exit_restore_elr_and_panic' seems more
appropriate.
> + // x0-x29,lr: hyp regs
> +
> + stp x0, x1, [sp, #-16]!
> + adr_this_cpu x0, kvm_hyp_ctxt, x1
> + ldr x0, [x0, #CPU_ELR_EL2]
> + msr elr_el2, x0
> + ldp x0, x1, [sp], #16
> +
> SYM_INNER_LABEL(__guest_exit_panic, SYM_L_GLOBAL)
> // x2-x29,lr: vcpu regs
> // vcpu x0-x1 on the stack
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index a038320cdb08..6a8dc8d3c193 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -747,7 +747,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>
> static inline void __kvm_unexpected_el2_exception(void)
> {
> - extern char __guest_exit_panic[];
> + extern char __guest_exit_panic_with_restored_elr[];
> unsigned long addr, fixup;
> struct kvm_exception_table_entry *entry, *end;
> unsigned long elr_el2 = read_sysreg(elr_el2);
> @@ -769,7 +769,9 @@ static inline void __kvm_unexpected_el2_exception(void)
> }
>
> /* Trigger a panic after restoring the hyp context. */
> - write_sysreg(__guest_exit_panic, elr_el2);
> + write_sysreg(__guest_exit_panic_with_restored_elr, elr_el2);
> +
> + this_cpu_ptr(&kvm_hyp_ctxt)->sys_regs[ELR_EL2] = elr_el2;
nit: placing the saving of ELR_EL2 before overriding the sysreg makes
the whole thing a bit more readable, specially given the poor choice
of 'elr_el2' as a variable (visually clashing with 'elr_el2' as a
system register).
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list