[PATCH v7 5/6] KVM: arm64: Detect and handle hypervisor stack overflows

Kalesh Singh kaleshsingh at google.com
Wed Apr 20 14:51:12 PDT 2022


On Mon, Apr 18, 2022 at 7:41 PM Kalesh Singh <kaleshsingh at google.com> wrote:
>
> On Mon, Apr 18, 2022 at 3:09 AM Marc Zyngier <maz at kernel.org> wrote:
> >
> > On Fri, 08 Apr 2022 21:03:28 +0100,
> > Kalesh Singh <kaleshsingh at google.com> wrote:
> > >
> > > The hypervisor stacks (for both nVHE Hyp mode and nVHE protected mode)
> > > > are aligned such that any valid stack address has PAGE_SHIFT bit as 1.
> > > This allows us to conveniently check for overflow in the exception entry
> > > without corrupting any GPRs. We won't recover from a stack overflow so
> > > panic the hypervisor.
> > >
> > > Signed-off-by: Kalesh Singh <kaleshsingh at google.com>
> > > Tested-by: Fuad Tabba <tabba at google.com>
> > > Reviewed-by: Fuad Tabba <tabba at google.com>
> > > ---
> > >
> > > Changes in v7:
> > >   - Add Fuad's Reviewed-by and Tested-by tags.
> > >
> > > Changes in v5:
> > >   - Valid stack addresses now have PAGE_SHIFT bit as 1 instead of 0
> > >
> > > Changes in v3:
> > >   - Remove test_sp_overflow macro, per Mark
> > >   - Add asmlinkage attribute for hyp_panic, hyp_panic_bad_stack, per Ard
> > >
> > >
> > >  arch/arm64/kvm/hyp/nvhe/host.S   | 24 ++++++++++++++++++++++++
> > >  arch/arm64/kvm/hyp/nvhe/switch.c |  7 ++++++-
> > >  2 files changed, 30 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
> > > index 3d613e721a75..be6d844279b1 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/host.S
> > > +++ b/arch/arm64/kvm/hyp/nvhe/host.S
> > > @@ -153,6 +153,18 @@ SYM_FUNC_END(__host_hvc)
> > >
> > >  .macro invalid_host_el2_vect
> > >       .align 7
> > > +
> > > +     /*
> > > +      * Test whether the SP has overflowed, without corrupting a GPR.
> > > +      * nVHE hypervisor stacks are aligned so that the PAGE_SHIFT bit
> > > +      * of SP should always be 1.
> > > +      */
> > > +     add     sp, sp, x0                      // sp' = sp + x0
> > > +     sub     x0, sp, x0                      // x0' = sp' - x0 = (sp + x0) - x0 = sp
> > > +     tbz     x0, #PAGE_SHIFT, .L__hyp_sp_overflow\@
> > > +     sub     x0, sp, x0                      // x0'' = sp' - x0' = (sp + x0) - sp = x0
> > > +     sub     sp, sp, x0                      // sp'' = sp' - x0 = (sp + x0) - x0 = sp
> > > +
> > >       /* If a guest is loaded, panic out of it. */
> > >       stp     x0, x1, [sp, #-16]!
> > >       get_loaded_vcpu x0, x1
> > > @@ -165,6 +177,18 @@ SYM_FUNC_END(__host_hvc)
> > >        * been partially clobbered by __host_enter.
> > >        */
> > >       b       hyp_panic
> > > +
> > > +.L__hyp_sp_overflow\@:
> > > +     /*
> > > +      * Reset SP to the top of the stack, to allow handling the hyp_panic.
> > > +      * This corrupts the stack but is ok, since we won't be attempting
> > > +      * any unwinding here.
> > > +      */
> > > +     ldr_this_cpu    x0, kvm_init_params + NVHE_INIT_STACK_HYP_VA, x1
> > > +     mov     sp, x0
> > > +
> > > +     bl      hyp_panic_bad_stack
> >
> > Why bl? You clearly don't expect to return here, given that you have
> > an ASM_BUG() right below, and that you are calling a __no_return
> > function. I think we should be consistent with the rest of the code
> > and just do a simple branch.
>
> The idea was to use bl to give the hyp_panic_bad_stack() frame in the
> stack trace, which makes it easy to identify overflows. I can add a
> comment and drop the redundant ASM_BUG().

Sorry, my mistake here: bl only records the current frame
(hyp_host_vector) in the stack trace, so it has no effect on whether
hyp_panic_bad_stack (the next frame) shows up there. Addressed in v8:
https://lore.kernel.org/r/20220420214317.3303360-6-kaleshsingh@google.com/
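
For anyone following the add/sub sequence in the quoted hunk, here is a
minimal C model of the register arithmetic. It is only a sketch: struct
regs, sp_overflowed() and self_check() are hypothetical names used for
illustration, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and the
example addresses are made up. It models the check itself, not the
exception entry path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two registers the check touches. */
struct regs {
	uint64_t sp;
	uint64_t x0;
};

/*
 * Mirrors the add/sub/tbz/sub/sub sequence. Unsigned arithmetic wraps
 * modulo 2^64, just like the 64-bit registers, so both values survive.
 * Returns true if the PAGE_SHIFT bit of the original SP is clear, in
 * which case x0 holds the original SP and sp holds sp + x0 (the asm
 * branches away at this point and resets SP).
 */
static bool sp_overflowed(struct regs *r)
{
	r->sp = r->sp + r->x0;		/* add sp, sp, x0  : sp' = sp + x0 */
	r->x0 = r->sp - r->x0;		/* sub x0, sp, x0  : x0' = sp      */
	if (!((r->x0 >> PAGE_SHIFT) & 1))	/* tbz x0, #PAGE_SHIFT     */
		return true;
	r->x0 = r->sp - r->x0;		/* sub x0, sp, x0  : x0'' = x0     */
	r->sp = r->sp - r->x0;		/* sub sp, sp, x0  : sp'' = sp     */
	return false;
}

/* Usage example with made-up addresses. */
static void self_check(void)
{
	struct regs ok = { .sp = 0x3f00, .x0 = 0xdeadbeefULL };
	struct regs bad = { .sp = 0x2ff0, .x0 = UINT64_MAX };

	/* PAGE_SHIFT bit set: no overflow, both registers restored. */
	assert(!sp_overflowed(&ok));
	assert(ok.sp == 0x3f00 && ok.x0 == 0xdeadbeefULL);

	/* PAGE_SHIFT bit clear: overflow, x0 now holds the old SP. */
	assert(sp_overflowed(&bad));
	assert(bad.x0 == 0x2ff0);
}
```

The point of the sequence is that no scratch register and no stack
slot is needed to test SP, which matters because the stack may be the
thing that is broken.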

Thanks,
Kalesh

>
> Thanks,
> Kalesh
>
> >
> > It also gives us a chance to preserve an extra register from the
> > context.
> >
> > > +     ASM_BUG()
> > >  .endm
> > >
> > >  .macro invalid_host_el1_vect
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> > > index 6410d21d8695..703a5d3f611b 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> > > @@ -347,7 +347,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> > >       return exit_code;
> > >  }
> > >
> > > -void __noreturn hyp_panic(void)
> > > +asmlinkage void __noreturn hyp_panic(void)
> > >  {
> > >       u64 spsr = read_sysreg_el2(SYS_SPSR);
> > >       u64 elr = read_sysreg_el2(SYS_ELR);
> > > @@ -369,6 +369,11 @@ void __noreturn hyp_panic(void)
> > >       unreachable();
> > >  }
> > >
> > > +asmlinkage void __noreturn hyp_panic_bad_stack(void)
> > > +{
> > > +     hyp_panic();
> > > +}
> > > +
> > >  asmlinkage void kvm_unexpected_el2_exception(void)
> > >  {
> > >       return __kvm_unexpected_el2_exception();
> > > --
> > > 2.35.1.1178.g4f1659d476-goog
> > >
> > >
> >
> > Thanks,
> >
> >         M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
