[RFC PATCH 0/3] arm64: Implement reliable stack trace

Mon Feb 1 18:00:32 EST 2021

On Mon, Feb 01, 2021 at 03:38:53PM -0600, Madhavan T. Venkataraman wrote:
> So, I have a few questions from a livepatch perspective.
> 
> For livepatch, the kernel makes sure that a task is not running when its stack is checked,
> correct?

Correct.

> The only possibility I can think of is that the task could have received an
> interrupt and could have been preempted at the end of the interrupt. The interrupt
> could have happened during the frame pointer prolog or epilog. Is this the problem case
> for livepatch?
> 
> If the unwinder could check a flag in the task that indicates that the task was interrupted,
> the unwinder could declare the stack trace unreliable. E.g., a (hacky) solution could
> be to set and clear the flag in preempt_schedule_irq() which takes a task off a CPU
> when it is preempted at the end of an interrupt. The flag would remain set while the task is not
> on a CPU.
> 
> Similarly, for exceptions, can we set a flag in a task indicating that it is processing
> an exception? Is there a top level exception handler where we can do this? Is there common
> code that exception handlers use where we can set this? Or, can we deduce this from ptregs->pstate
> that is saved for the task?
> 
> Mind you, the flag is advisory. If the unwinder has some way to unwind through an exception,
> more power to it.

For x86 (frame pointers), entry code uses ENCODE_FRAME_POINTER, which
creates a special pt_regs frame.

When the reliable unwinder sees the encoded regs on the stack, it knows
it encountered some asynchronous event, like preemption, and it marks
the stack unreliable.
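
A rough sketch of the idea, as simplified user-space C rather than the
actual kernel code.  It assumes the x86-64 style of encoding, where the
low bit of the saved frame pointer is set to tag it as a pt_regs
pointer.  The struct and function names below are made up for
illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved registers; not the kernel's struct pt_regs. */
struct fake_regs {
	uintptr_t ip;
	uintptr_t sp;
};

/* Entry-code side: tag a regs pointer by setting its low bit. */
static uintptr_t encode_frame_pointer(const struct fake_regs *regs)
{
	return (uintptr_t)regs | 1u;
}

/* Unwinder side: return the regs pointer if the value is tagged, else NULL. */
static const struct fake_regs *decode_frame_pointer(uintptr_t fp)
{
	if (!(fp & 1u))
		return NULL;
	return (const struct fake_regs *)(fp & ~(uintptr_t)1);
}

/*
 * Walk a list of saved frame pointer values.  Any encoded regs frame
 * means some asynchronous event happened, so the trace is unreliable.
 */
static bool trace_is_reliable(const uintptr_t *frame_ptrs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (decode_frame_pointer(frame_ptrs[i]))
			return false;
	}
	return true;
}
```

This relies on normal frame pointers being aligned, so the low bit is
otherwise always clear and is free to act as the marker.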

> > Given that, I think that assuming we must use a shadow stack for
> > reliable unwinding would be jumping the gun.
> >
> 
> So, this is the problem I was considering. Let us say that a function properly sets up the
> frame pointer at the beginning and properly restores it to the previous value when it
> returns. But because of compiler bugs or some inline assembly code or other errant code,
> the frame pointer gets modified in the middle of the function. Then, the function calls
> another function. Then, the unwinder tries to unwind the stack. The unwinder has no
> way of knowing that the frame pointer was modified. To tackle this problem, Objtool
> has to laboriously walk all the code paths and track every modification to the stack and
> the frame pointer. And, if there are frame modifications, it has to fail the kernel build.
> Did I understand it correctly?

Yes, though it generally warns instead of failing the build.  But we
keep the warnings to zero as best we can.

BTW, the most common inline asm frame pointer bug we saw on x86 was a
call instruction which got inserted by GCC before the prologue -- or
sometimes there was no prologue because it was otherwise considered a
leaf function.
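
To illustrate the kind of check involved, here is a toy sketch in C.
It is not objtool's implementation: the instruction kinds and the
function name are hypothetical, and it only walks one straight-line
sequence, whereas the real tool decodes machine code and follows every
code path.  It counts calls made while no frame pointer is set up:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds (hypothetical); a real checker decodes machine code. */
enum toy_op { OP_PUSH_FP, OP_MOV_SP_FP, OP_POP_FP, OP_CALL, OP_RET, OP_OTHER };

/*
 * Walk a straight-line instruction sequence and return the number of
 * calls issued before the frame pointer was saved and set up.
 */
static int count_frameless_calls(const enum toy_op *insns, size_t n)
{
	bool fp_saved = false;	/* old frame pointer pushed */
	bool fp_setup = false;	/* frame pointer now points at this frame */
	int warnings = 0;

	for (size_t i = 0; i < n; i++) {
		switch (insns[i]) {
		case OP_PUSH_FP:
			fp_saved = true;
			break;
		case OP_MOV_SP_FP:
			if (fp_saved)
				fp_setup = true;
			break;
		case OP_POP_FP:
			fp_saved = false;
			fp_setup = false;
			break;
		case OP_CALL:
			if (!fp_setup)
				warnings++;
			break;
		case OP_RET:
		case OP_OTHER:
			break;
		}
	}
	return warnings;
}
```

A sequence with the call ahead of the prologue, or with no prologue at
all, gets flagged.  A sequence with the call between the prologue and
the epilogue does not.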

> In these cases, the shadow stack can be used to unwind the stack. The shadow stack has
> return addresses pushed on it. For livepatch purposes, this is good enough.

We try to fix every warning.  For the few warnings we whitelist instead
of fixing, we make sure it's not a risk for live patching.

> >> Objtool will check for the no-ops. If they are present, it will replace the no-ops with
> >> the shadow stack prolog and epilog. It can also check the frame pointer prolog and
> >> epilog.
> > 
> > I suspect this will interact poorly with patchable-function-entry, which
> > prefixes each instrumentable function with some NOPs.
> > 
> 
> Objtool knows if the kernel was configured with tracing. The compiler inserts a fixed,
> known number of no-ops for tracing purposes. So, why is it difficult for objtool to
> find the prolog/epilog no-ops?

Objtool tries to stay out of the code generation business.  Because then
who's going to validate objtool's code :-)

And the compiler already does a decent job at generating it.

> > I think at this point, we haven't gained anything from using a shadow
> > stack, and I'd much rather we used objtool to gather any metadata needed
> > to make unwinding reliable without mandating a shadow stack.
> > 
> 
> I think we have gained something. Pushing the return addresses on the shadow stack
> and using them to unwind means that objtool does not have to decode every single
> instruction and track the changes to the stack and frame state. It also means
> that the kernel build does not have to be failed when some frame modification is
> detected by objtool.

How do we know the kernel has full and accurate CFI coverage?

The original version of objtool was an awk script which basically just
crudely looked for the prologue/epilogue instructions.  It mostly
worked.

But it wasn't 100% reliable, and these days the prologue isn't always at the
beginning, and the epilogue is usually buried in the middle.  And
sometimes there are more stack pushes/pops hidden outside of the formal
prologue/epilogue.  Not to mention asm code which does all kinds of
crazy things.  And other edge cases, like leaf functions which don't
require frame pointers, and alternatives patching/paravirt/etc which can
muck with the stack layout at runtime.  Eventually we realized a "full
coverage" objtool is the wisest approach.

Also, a simpler version of objtool isn't really an option on the x86
side, since we now have a lot of other features relying on its full
coverage.  Other than the decoder, most of the objtool logic is
arch-independent.

-- 
Josh