[RFC PATCH 0/3] arm64: Implement reliable stack trace

Mon Feb 1 11:02:25 EST 2021

Hi Madhavan,

On Mon, Feb 01, 2021 at 09:21:43AM -0600, Madhavan T. Venkataraman wrote:
> On 1/28/21 9:26 AM, Josh Poimboeuf wrote:
> >> If we're trusting the compiler we can probably just do that without any
> >> explicit support from the compiler - it should be doing the standard
> >> stuff unless we explicitly ask it to and if it isn't then it might be a
> >> result of a mismatch in assumptions rather than a deliberate decision to
> >> do something non-standard.  My understanding with objtool is that a big
> >> part of the idea is to provide a static check that the binary we end up
> >> with matches the assumptions that we are making so the fact that it's a
> >> separate implementation is important.
> > For C code, even if we trusted the compiler (which we don't), we still
> > have inline asm which the compiler doesn't have any visibility to, which
> > is more than capable of messing up frame pointers (we had several cases
> > of this in x86).
> > 
> > Getting the assembler to annotate which functions are FP could be
> > interesting but:
> > 
> > a) good luck getting the assembler folks to do that; they tend to be
> >    insistent on being ignorant of code semantics;
> > 
> > b) assembly often resembles spaghetti and the concept of a function is
> >    quite fluid; in fact many functions aren't annotated as such.
> 
> OK. Before this whole discussion, I did not know that the compiler cannot be trusted.

I think "the compiler cannot be trusted" is overly strong. We want to
*verify* that the compiler is doing what we expect, which might be more
than what it guarantees to do (and those expectations can change over
time), but it doesn't mean we should try to avoid the compiler wherever
possible.

For assembly I expect we'll need to do /some/ manual annotation (e.g.
for trampolines).

> So, it looks like objtool is definitely needed. However, I believe we can minimize
> the work objtool does by using a shadow stack.
> 
> I read Mark Brown's response to my shadow stack email. I agree with him. The shadow
> stack looks promising.
> 
> So, here is my suggestion for the shadow stack. This is just to start the discussion
> on the shadow stack.

Regarding unwinding, shadow stacks have the same problems as frame
records (considering exceptions/interrupts) in that the primary problem
is knowing *when* they are updated, and knowing *when* the LR is
valid or invalid or duplicated (in a frame record or shadow stack
entry).

Given that, I think that assuming we must use a shadow stack for
reliable unwinding would be jumping the gun.

> Prolog and epilog for C functions
> =================================
> 
> Some shadow stack prolog and epilog are needed. Let us add a new option to the compiler
> to generate extra no-ops at the beginning of a function for the prolog and just before
> return for the epilog so some other entity such as objtool can add its own prolog and
> epilog. This is so we don't have to trust the compiler and can maintain our own prolog
> and epilog.

Why wouldn't we ask the compiler to do this, and just check this in the
tooling?

... and if we can do that, why not just check the frame pointer
manipulation?

Note that functions can have multiple return paths, so there might not
be a single epilogue. Also, today some functions have special-cased
prologues for early checks which return before a frame record is
created, e.g.

| function:
| 	CBNZ	X0, _func
| 	RET
| _func:
| 	STP	X29, X30, [SP, #-FRAME_SIZE]!
| 	MOV	X29, SP
| 	...
| 	LDP	X29, X30, [SP], #FRAME_SIZE
| 	RET

... and forcing additional work in those could be detrimental.

> Objtool will check for the no-ops. If they are present, it will replace the no-ops with
> the shadow stack prolog and epilog. It can also check the frame pointer prolog and
> epilog.

I suspect this will interact poorly with patchable-function-entry, which
prefixes each instrumentable function with some NOPs.
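
As a rough illustration of the concern (a userspace sketch only, assuming
the shadow stack placeholders would also be plain NOPs at function entry;
count_leading_nops() is a made-up helper, not something objtool has):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define A64_NOP	0xd503201fu	/* A64 encoding of NOP */

/* Hypothetical helper: count the NOPs at the start of a function body. */
static size_t count_leading_nops(const uint32_t *insn, size_t n)
{
	size_t i = 0;

	assert(insn != NULL);
	while (i < n && insn[i] == A64_NOP)
		i++;
	return i;
}
```

A function built with e.g. -fpatchable-function-entry=2 already starts
with two NOPs, so the count alone doesn't say which NOPs (if any) are
shadow stack placeholders. The tool would have to be told how many belong
to ftrace.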

> Then, it will set a flag in the symbol table entry of the function to indicate that
> the function has a proper prolog and epilog.

I think this boils down to having a prologue and epilogue check, which
seems sane.
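
To make that concrete, a minimal sketch of such a check for the simplest
frame record shape could look like the below. This is illustrative
userspace C, not objtool code: the helper names are made up, and it
assumes the prologue is exactly "STP X29, X30, [SP, #-N]!; MOV X29, SP"
and the epilogue is "LDP X29, X30, [SP], #N; RET". Compilers emit other
shapes too, which a real check would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* STP X29, X30, [SP, #imm]! with the imm7 field (bits 21:15) masked out */
#define STP_FP_LR_PRE_MASK	0xffc07fffu
#define STP_FP_LR_PRE_VAL	0xa9807bfdu
/* LDP X29, X30, [SP], #imm with the imm7 field (bits 21:15) masked out */
#define LDP_FP_LR_POST_MASK	0xffc07fffu
#define LDP_FP_LR_POST_VAL	0xa8c07bfdu
#define MOV_FP_SP		0x910003fdu	/* MOV X29, SP */
#define RET_LR			0xd65f03c0u	/* RET (X30) */

/* Hypothetical: does the function start by creating a frame record? */
static bool has_fp_prologue(const uint32_t *insn, size_t n)
{
	assert(insn != NULL);
	return n >= 2 &&
	       (insn[0] & STP_FP_LR_PRE_MASK) == STP_FP_LR_PRE_VAL &&
	       insn[1] == MOV_FP_SP;
}

/* Hypothetical: is the RET at ret_idx preceded by the frame record pop? */
static bool has_fp_epilogue_at(const uint32_t *insn, size_t n, size_t ret_idx)
{
	assert(insn != NULL);
	return ret_idx >= 1 && ret_idx < n &&
	       insn[ret_idx] == RET_LR &&
	       (insn[ret_idx - 1] & LDP_FP_LR_POST_MASK) == LDP_FP_LR_POST_VAL;
}
```

Note this only matches instruction patterns. It says nothing about
whether control flow actually reaches them, which is where proper flow
analysis comes in.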

> Prolog and epilog for assembly functions
> ========================================
> 
> The no-ops and frame pointer prolog and epilog can be added to assembly functions manually.
> Objtool will process them as above.
> 
> Decoding
> ========
> 
> To do all this, objtool has to decode only the following instructions.
> 
>         - no-op
>         - return instruction
> 	- store register pair in frame pointer prolog
> 	- load register pair in frame pointer epilog
> 
> This simplifies the objtool part a lot. AFAIK, all instructions in ARM64 are
> 32 bits wide. So, objtool does not have to decode an instruction to know its
> length.
> 
> Objtool has to scan a function for the return instruction to know the location(s)
> of the epilog.

That wouldn't be robust if you consider things like:

| func:
| 	STP	X29, X30, [SP, #-FRAME_SIZE]!
| 	MOV	X29, SP
| 	B	__fake_ret
| 	LDP	X29, X30, [SP], #FRAME_SIZE
| 	RET
| __fake_ret:
| 	BR	X30

... which is the sort of thing we want objtool to catch.
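
As a sketch of why a plain scan for RET falls short, here is the example
above as A64 encodings, assuming FRAME_SIZE is 16 and that the final
branch is encoded as BR X30. The two scanners are hypothetical and only
meant to show the gap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define A64_RET_LR	0xd65f03c0u	/* RET (defaults to X30) */
#define A64_BR_LR	0xd61f03c0u	/* BR X30 */

/* The example function, assuming FRAME_SIZE == 16. */
static const uint32_t fake_ret_func[] = {
	0xa9bf7bfdu,	/* STP X29, X30, [SP, #-16]! */
	0x910003fdu,	/* MOV X29, SP */
	0x14000003u,	/* B   __fake_ret (3 instructions ahead) */
	0xa8c17bfdu,	/* LDP X29, X30, [SP], #16 (never reached) */
	0xd65f03c0u,	/* RET (never reached) */
	0xd61f03c0u,	/* __fake_ret: BR X30 */
};

/* Naive approach: treat only RET as a function exit. */
static size_t count_ret(const uint32_t *insn, size_t n)
{
	size_t i, count = 0;

	assert(insn != NULL);
	for (i = 0; i < n; i++)
		if (insn[i] == A64_RET_LR)
			count++;
	return count;
}

/* Slightly less naive: also treat BR X30 as a function exit. */
static size_t count_exits(const uint32_t *insn, size_t n)
{
	size_t i, count = 0;

	assert(insn != NULL);
	for (i = 0; i < n; i++)
		if (insn[i] == A64_RET_LR || insn[i] == A64_BR_LR)
			count++;
	return count;
}
```

The RET-only scan finds a single "epilogue", but it is the unreachable
one. The exit that is actually taken never restores the frame record, and
even the second scanner only spots it because this particular encoding
was special-cased. Catching this in general needs control flow tracking
rather than pattern matching.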

[...]

> Unwinder logic
> ==============
> 
> The unwinder will walk the stack using frame pointers like it does
> currently. As it unwinds the regular stack, it will also unwind the
> shadow stack:
> 
> However, at each step, it needs to perform some additional checks:
> 
>         symbol = lookup symbol table entry for pc
>         if (!symbol)
>                 return -EINVAL;
> 
>         if (symbol does not have proper prolog and epilog)
>                 return -EINVAL;

I think at this point, we haven't gained anything from using a shadow
stack, and I'd much rather we used objtool to gather any metadata needed
to make unwinding reliable without mandating a shadow stack.
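
For illustration, the per-frame check in the quoted pseudo-code needs
nothing shadow stack specific. A minimal userspace sketch, assuming some
hypothetical per-function metadata emitted at build time (struct
func_meta and both helpers are invented for this example, and a real
implementation would not use a linear search):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-function metadata that a tool such as objtool could emit. */
struct func_meta {
	uint64_t start;		/* address of the first instruction */
	uint64_t size;		/* size of the function in bytes */
	bool frame_ok;		/* prologue and epilogue were verified */
};

/* Find the function containing pc, or NULL if there is none. */
static const struct func_meta *lookup_func(const struct func_meta *tab,
					   size_t n, uint64_t pc)
{
	size_t i;

	assert(tab != NULL);
	for (i = 0; i < n; i++) {
		/* Unsigned wrap makes this false when pc < start. */
		if (pc - tab[i].start < tab[i].size)
			return &tab[i];
	}
	return NULL;
}

/* Returns 0 if unwinding through pc can be trusted, -EINVAL otherwise. */
static int check_unwind_step(const struct func_meta *tab, size_t n,
			     uint64_t pc)
{
	const struct func_meta *m = lookup_func(tab, n, pc);

	if (!m)
		return -EINVAL;
	if (!m->frame_ok)
		return -EINVAL;
	return 0;
}
```

The unwinder would call something like check_unwind_step() for each PC it
recovers from a frame record, and give up on the first failure.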

Thanks,
Mark.