[PATCH 4/4] arm64: ftrace: Add return address protection

Mark Rutland mark.rutland at arm.com
Thu Dec 1 06:40:34 PST 2022


On Thu, Dec 01, 2022 at 02:09:41PM +0100, Ard Biesheuvel wrote:
> On Wed, 30 Nov 2022 at 18:45, Mark Rutland <mark.rutland at arm.com> wrote:
> >
> > On Wed, Nov 30, 2022 at 03:26:19PM +0100, Ard Biesheuvel wrote:
> > > On Wed, 30 Nov 2022 at 15:04, Mark Rutland <mark.rutland at arm.com> wrote:
> > > > On Tue, Nov 29, 2022 at 03:18:03PM +0100, Ard Biesheuvel wrote:
> > > > > Use the newly added asm macros to protect and restore the return address
> > > > > in the ftrace call wrappers, based on whichever method is active (PAC
> > > > > and/or shadow call stack).
> > > > >
> > > > > If the graph tracer is in use, this covers both the return address *to*
> > > > > the ftrace call site as well as the return address *at* the call site,
> > > > > and the latter will either be restored in return_to_handler(), or before
> > > > > returning to the call site.
> > > > >
> > > > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > > > > ---
> > > > >  arch/arm64/kernel/entry-ftrace.S | 28 +++++++++++++++++++-
> > > > >  1 file changed, 27 insertions(+), 1 deletion(-)
> > > >
> > > > As a heads-up, this code has recently changed quite significantly, and this
> > > > won't apply to the version queued in arm64's for-next/{ftrace,core} branches.
> > > >
> > > > I had a direction of travel in mind with some changes for better stacktracing,
> > > > which won't work with the approach here, so I'd prefer we do this a bit
> > > > differently; more on that below.
> > > >
> > > > > diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> > > > > index 795344ab4ec45889..c744e4dd8c90a352 100644
> > > > > --- a/arch/arm64/kernel/entry-ftrace.S
> > > > > +++ b/arch/arm64/kernel/entry-ftrace.S
> > > > > @@ -35,6 +35,11 @@
> > > > >   * is missing from the LR and existing chain of frame records.
> > > > >   */
> > > > >       .macro  ftrace_regs_entry, allregs=0
> > > > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > > > > +     protect_return_address x9
> > > > > +#endif
> > > > > +     protect_return_address x30
> > > >
> > > > I think if we're going to protect the callsite's original LR (x9 here), we
> > > > should do that regardless of CONFIG_FUNCTION_GRAPH_TRACER; what matters is
> > > > whether that's vulnerable rather than whether we intend to modify it, so I
> > > > don't think it makes sense to protect it conditionally based on
> > > > CONFIG_FUNCTION_GRAPH_TRACER.
> > >
> > > My reasoning was that if we are not going to return from it (in
> > > return_to_handler()), we can rely on the interrupted function to
> > > sign/authenticate it as usual. So the only reason for signing it here
> > > is so that we can authenticate it in return_to_handler() if that
> > > exists on the call path, removing a potentially vulnerable sequence
> > > from that function.
> >
> > What I was trying to point out is that there is a window where this is spilled
> > to the stack (and hence is potentially vulnerable) between
> > ftrace_{caller,regs_caller}() and the end of ftrace_common().
> >
> > So if we don't protect this when CONFIG_FUNCTION_GRAPH_TRACER=n, it could be
> > clobbered during that window (e.g. while function tracers are invoked),
> > *before* we return back into the instrumented function and sign the
> > (potentially already clobbered) value.
> 
> Agreed.
> 
> But to clarify, the intent of this series is not to add protection to
> ftrace, the intent is to get rid of the gadgets from the ftrace code
> that can be abused even if you don't use ftrace at all.

Ok; sorry for missing that; I'll need to think a little harder.

> > Hence, my thinking is that we should sign this regardless of
> > CONFIG_FUNCTION_GRAPH_TRACER to mitigate that case. I agree that we also want
> > it to be signed while it's in the graph return stack (i.e. until the
> > instrumented function returns back to return_to_handler()). In general, we
> > should sign the value if it's going to be spilled to the stack.
> 
> Sure, but it solves a different problem.

Fair enough!

I think we're agreed that something which solves both issues makes sense, even
if that's not necessary for the gadgetisation issue specifically?
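To make that concrete, the property I'd like is that both values are signed
for the whole window in which they live in pt_regs, regardless of which
tracer is enabled. Roughly (untested sketch, reusing the macro names from
patch 1/4):

	.macro	ftrace_regs_entry, allregs=0
	protect_return_address	x9	// callsite's original LR, spilled to pt_regs below
	protect_return_address	x30	// return address back to the ftrace call site
	...				// spill to pt_regs, invoke tracers
	.endm

... with both unconditionally authenticated again on the return path.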

> > > > I'm a bit worried this might confuse some ftrace code manipulating the return
> > > > address (e.g. manipulation of the ftrace graph return stack), as I don't think
> > > > that's all PAC-clean, and might need some modification.
> > >
> > > This is the reason for the xpaci instruction below.
> >
> > Unfortunately, that alone isn't sufficient.
> >
> > What I was alluding to is that this change means the ftrace graph return stack
> > contains signed addresses, and other code doesn't expect that. For example,
> > arm64's stacktrace code currently depends on the graph return stack containing
> > plain pointers, and so that gets broken as of this patch when function graph
> > tracing is enabled:
> >
> > | # uname -a
> > | Linux buildroot 6.1.0-rc7-00003-g44a67f0b8ac7 #1 SMP PREEMPT Wed Nov 30 17:19:38 GMT 2022 aarch64 GNU/Linux
> > | # cat /proc/self/stack
> > | [<0>] proc_pid_stack+0xc0/0x130
> > | [<0>] proc_single_show+0x68/0x120
> > | [<0>] seq_read_iter+0x16c/0x45c
> > | [<0>] seq_read+0x98/0xd0
> > | [<0>] vfs_read+0xc8/0x2c0
> > | [<0>] ksys_read+0x78/0x110
> > | [<0>] __arm64_sys_read+0x24/0x30
> > | [<0>] invoke_syscall+0x50/0x120
> > | [<0>] el0_svc_common.constprop.0+0xd4/0xf4
> > | [<0>] do_el0_svc+0x34/0xd0
> > | [<0>] el0_svc+0x2c/0x84
> > | [<0>] el0t_64_sync_handler+0xf4/0x120
> > | [<0>] el0t_64_sync+0x18c/0x190
> > | # echo function_graph > /sys/kernel/debug/tracing/current_tracer
> > | # cat /proc/self/stack
> > | [<0>] 0xf5f98000083dff40
> > | [<0>] 0xd6b88000083e0f68
> > | [<0>] 0x21ac800008381ad0
> > | [<0>] 0xd0bc800008381e58
> > | [<0>] 0x22b280000834bc28
> > | [<0>] 0xf0ca80000834c5c8
> > | [<0>] 0x299080000834c684
> > | [<0>] 0xb1a1800008029cf0
> > | [<0>] 0x9bd0800008029e94
> > | [<0>] 0x1788800008029ee8
> > | [<0>] 0xa08680000916dd5c
> > | [<0>] el0t_64_sync_handler+0xf4/0x120
> > | [<0>] el0t_64_sync+0x18c/0x190
> >
> > That's unfortunate (and would break RELIABLE_STACKTRACE, which we're slowly
> > getting towards being able to implement), but it's simple enough to account for
> > in the stacktrace code.
> >
> 
> Indeed. Those functions should just strip the PAC bits, no?

For that case, yup. That was roughly what I meant about it being simple to deal
with in the stacktrace code. :)
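
e.g. something along the lines of the below in the unwinder's graph-return
fixup (untested sketch; the field names are from memory, and whether
ptrauth_strip_insn_pac() can be used as-is in that context is an assumption
on my part):

	orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
					(void *)state->fp);
	/* the ret_stack entry is now signed; strip the PAC before reporting it */
	orig_pc = ptrauth_strip_insn_pac(orig_pc);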

> > I have a fear that there are other cases where code tries to consume the graph
> > return stack (or to match against entries within it), which would be similarly
> > broken. I vaguely recall that we had issues of that shape in the past when we
> > tried to adjust the reported PC value, and would need to go page that in to
> > check that we don't open a similar issue here.
> 
> OK

FWIW, I'm happy to go audit that, I just wanted to make sure we didn't forget
to do so, since it's not obvious that there are potential issues there.

[...]

> > > > >       /* Restore the callsite's SP */
> > > > >       add     sp, sp, #PT_REGS_SIZE + 16
> > > > >
> > > > > +     restore_return_address x9
> > > > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > > > > +     /* compare the original return address with the actual one */
> > > > > +     cmp     x10, x30
> > > > > +     b.ne    0f
> > > > > +
> > > > > +     /*
> > > > > +      * If they are the same, unprotect it now. If it was modified, it will
> > > > > +      * be dealt with in return_to_handler() below.
> > > > > +      */
> > > > > +     restore_return_address x30
> > > > > +0:
> > > > > +#endif
> > > > >       ret     x9

> > > > As above, I'd prefer that we had ftrace_graph_func() fix things up so that we
> > > > can unconditionally authenticate things here, which would be a bit stronger and
> > > > simpler to reason about.
> > > >
> > >
> > > I think having it all in one place makes it much easier to reason about,
> > > tbh. Adding additional handling of the PAC state as well as the shadow
> > > call stack in ftrace_graph_func() seems much more fiddly to me.
> >
> > I appreciate that concern, but my intuition here is the inverse; I'd like to
> > avoid the conditionality in the regular tracing path to make that clearly
> > balanced and (from my perspective) easier to reason about.
> >
> > I'm happy if we have to do a bit more work in ftrace_graph_func() and
> > return_to_handler() since those are already more special anyway.
> >
> 
> Fair enough. As long as the asm routines have a SCS pop or AUTIASP
> between reloading x30 and returning to it, I don't have any problems
> with that.

Sure; I think that's workable. I have a rough shape in mind, so I'll have a go
at that as an example and try to get back to you shortly.
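
For the tail of the trampoline, the rough shape I have in mind is something
like the below (untested, details to follow):

	/* reload x29/x30/x9 and restore the callsite's SP as today, then: */
	restore_return_address	x9	// authenticate the address we return to
	restore_return_address	x30	// callsite's LR, fixed up by ftrace_graph_func()
	ret	x9

... so that there's no window between reloading either register and using it,
and no conditionality in the common path.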

With that in mind, I think we should also fix up
qcom_link_stack_sanitisation(), since that ends up creating a gadget of the form:

	MOV	X30, Xn
	RET

... and that can be fixed by leaving it to the compiler to save/restore x30,
whereupon it should create a frame record and all the usual PAC goodness.
Example patch below (reformatted into the usual arm64 inline asm style).
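With x30 in the clobber list, the codegen should end up looking roughly like
the below when PAC is enabled (illustrative only; exact output will vary by
compiler and config):

	qcom_link_stack_sanitisation:
		paciasp				// sign x30 against SP
		stp	x29, x30, [sp, #-16]!	// real frame record
		mov	x29, sp
		...				// the .rept 16 / bl . + 4 loop
		ldp	x29, x30, [sp], #16
		autiasp				// authenticate before returning
		ret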

Thanks,
Mark.

---->8----
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index bfce41c2a53b3..9fc54facf1ccb 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -250,12 +250,13 @@ static noinstr void qcom_link_stack_sanitisation(void)
 {
        u64 tmp;
 
-       asm volatile("mov       %0, x30         \n"
-                    ".rept     16              \n"
-                    "bl        . + 4           \n"
-                    ".endr                     \n"
-                    "mov       x30, %0         \n"
-                    : "=&r" (tmp));
+       asm volatile(
+       "       .rept   16              \n"
+       "       bl      . + 4           \n"
+       "       .endr                   \n"
+       : "=&r" (tmp)
+       :
+       : "x30");
 }
 
 static bp_hardening_cb_t spectre_v2_get_sw_mitigation_cb(void)



