[PATCH 4/4] arm64: ftrace: Add return address protection

Mark Rutland mark.rutland at arm.com
Wed Nov 30 09:45:04 PST 2022


On Wed, Nov 30, 2022 at 03:26:19PM +0100, Ard Biesheuvel wrote:
> On Wed, 30 Nov 2022 at 15:04, Mark Rutland <mark.rutland at arm.com> wrote:
> > On Tue, Nov 29, 2022 at 03:18:03PM +0100, Ard Biesheuvel wrote:
> > > Use the newly added asm macros to protect and restore the return address
> > > in the ftrace call wrappers, based on whichever method is active (PAC
> > > and/or shadow call stack).
> > >
> > > If the graph tracer is in use, this covers both the return address *to*
> > > the ftrace call site as well as the return address *at* the call site,
> > > and the latter will either be restored in return_to_handler(), or before
> > > returning to the call site.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > > ---
> > >  arch/arm64/kernel/entry-ftrace.S | 28 +++++++++++++++++++-
> > >  1 file changed, 27 insertions(+), 1 deletion(-)
> >
> > As a heads-up, this code has recently changed quite significantly, and this
> > won't apply to the version queued in arm64's for-next/{ftrace,core} branches.
> >
> > I had a direction of travel in mind with some changes for better stacktracing,
> > which won't work with the approach here, so I'd prefer we do this a bit
> > differently; more on that below.
> >
> > > diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> > > index 795344ab4ec45889..c744e4dd8c90a352 100644
> > > --- a/arch/arm64/kernel/entry-ftrace.S
> > > +++ b/arch/arm64/kernel/entry-ftrace.S
> > > @@ -35,6 +35,11 @@
> > >   * is missing from the LR and existing chain of frame records.
> > >   */
> > >       .macro  ftrace_regs_entry, allregs=0
> > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > > +     protect_return_address x9
> > > +#endif
> > > +     protect_return_address x30
> >
> > I think if we're going to protect the callsite's original LR (x9 here), we
> > should do that regardless of CONFIG_FUNCTION_GRAPH_TRACER; what matters is
> > whether that's vulnerable rather than whether we intend to modify it, so I
> > don't think it makes sense to protect it conditionally based on
> > CONFIG_FUNCTION_GRAPH_TRACER.
> 
> My reasoning was that if we are not going to return from it (in
> return_to_handler()), we can rely on the interrupted function to
> sign/authenticate it as usual. So the only reason for signing it here
> is so that we can authenticate it in return_to_handler() if that
> exists on the call path, removing a potentially vulnerable sequence
> from that function.

What I was trying to point out is that there is a window where this is spilled
to the stack (and hence is potentially vulnerable) between
ftrace_{caller,regs_caller}() and the end of ftrace_common().

So if we don't protect this when CONFIG_FUNCTION_GRAPH_TRACER=n, it could be
clobbered during that window (e.g. while function tracers are invoked),
*before* we return back into the instrumented function and sign the
(potentially already clobbered) value.

Hence, my thinking is that we should sign this regardless of
CONFIG_FUNCTION_GRAPH_TRACER to mitigate that case. I agree that we also want
it to be signed while it's in the graph return stack (i.e. until the
instrumented function returns back to return_to_handler()). In general, we
should sign the value if it's going to be spilled to the stack.

> > I'm a bit worried this might confuse some ftrace code manipulating the return
> > address (e.g. manipulation of the ftrace graph return stack), as I don't think
> > that's all PAC-clean, and might need some modification.
> 
> This is the reason for the xpaci instruction below.

Unfortunately, that alone isn't sufficient.

What I was alluding to is that this change means the ftrace graph return stack
contains signed addresses, and other code doesn't expect that. For example,
arm64's stacktrace code currently depends on the graph return stack containing
plain pointers, and so that gets broken as of this patch when function graph
tracing is enabled:

| # uname -a
| # Linux buildroot 6.1.0-rc7-00003-g44a67f0b8ac7 #1 SMP PREEMPT Wed Nov 30 17:19:38 GMT 2022 aarch64 GNU/Linux
| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xc0/0x130
| [<0>] proc_single_show+0x68/0x120
| [<0>] seq_read_iter+0x16c/0x45c
| [<0>] seq_read+0x98/0xd0
| [<0>] vfs_read+0xc8/0x2c0
| [<0>] ksys_read+0x78/0x110
| [<0>] __arm64_sys_read+0x24/0x30
| [<0>] invoke_syscall+0x50/0x120
| [<0>] el0_svc_common.constprop.0+0xd4/0xf4
| [<0>] do_el0_svc+0x34/0xd0
| [<0>] el0_svc+0x2c/0x84
| [<0>] el0t_64_sync_handler+0xf4/0x120
| [<0>] el0t_64_sync+0x18c/0x190
| # echo function_graph > /sys/kernel/debug/tracing/current_tracer 
| # cat /proc/self/stack
| [<0>] 0xf5f98000083dff40
| [<0>] 0xd6b88000083e0f68
| [<0>] 0x21ac800008381ad0
| [<0>] 0xd0bc800008381e58
| [<0>] 0x22b280000834bc28
| [<0>] 0xf0ca80000834c5c8
| [<0>] 0x299080000834c684
| [<0>] 0xb1a1800008029cf0
| [<0>] 0x9bd0800008029e94
| [<0>] 0x1788800008029ee8
| [<0>] 0xa08680000916dd5c
| [<0>] el0t_64_sync_handler+0xf4/0x120
| [<0>] el0t_64_sync+0x18c/0x190

That's unfortunate (and would break RELIABLE_STACKTRACE, which we're slowly
working towards being able to implement), but it's simple enough to account
for in the stacktrace code.
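
For illustration, accounting for it in the unwinder could look something like
the sketch below (only a sketch: it assumes a PAC-stripping helper along the
lines of ptrauth_strip_insn_pac(), and the surrounding names follow the
current arm64 unwinder rather than being a finished patch):

#include <linux/ftrace.h>
#include <asm/pointer_auth.h>
#include <asm/stacktrace.h>

/*
 * Sketch only: strip the PAC from any return address the unwinder is
 * about to report, including entries recovered from the fgraph return
 * stack, which may now have been stored signed.
 */
static unsigned long
unwind_recover_return_address(struct unwind_state *state, unsigned long pc)
{
	/* Drop any PAC bits before comparing or reporting the address. */
	pc = ptrauth_strip_insn_pac(pc);

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
	if (state->task->ret_stack &&
	    pc == (unsigned long)return_to_handler) {
		unsigned long orig_pc;

		orig_pc = ftrace_graph_ret_addr(state->task, NULL, pc,
						(void *)state->fp);
		/* The saved entry itself may carry a PAC. */
		pc = ptrauth_strip_insn_pac(orig_pc);
	}
#endif

	return pc;
}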

I'm worried that there are other places where code consumes the graph return
stack (or matches against entries within it), which would be similarly broken.
I vaguely recall we hit issues of that shape in the past when we tried to
adjust the reported PC value, and I'd need to go page that back in to check
that we don't reopen a similar issue here.

> > > +
> > >       /* Make room for pt_regs, plus a callee frame */
> > >       sub     sp, sp, #(PT_REGS_SIZE + 16)
> > >
> > > @@ -89,7 +94,9 @@ SYM_CODE_START(ftrace_caller)
> > >       b       ftrace_common
> > >  SYM_CODE_END(ftrace_caller)
> > >
> > > -SYM_CODE_START(ftrace_common)
> > > +SYM_CODE_START_LOCAL(ftrace_common)
> > > +     alternative_insn  nop, "xpaci x30", ARM64_HAS_ADDRESS_AUTH, IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
> > > +
> > >       sub     x0, x30, #AARCH64_INSN_SIZE     // ip (callsite's BL insn)
> > >       mov     x1, x9                          // parent_ip (callsite's LR)
> > >       ldr_l   x2, function_trace_op           // op
> > > @@ -115,9 +122,27 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
> > >       ldr     x30, [sp, #S_LR]
> > >       ldr     x9, [sp, #S_PC]
> > >
> > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > > +     /* grab the original return address from the stack */
> > > +     ldr     x10, [sp, #PT_REGS_SIZE + 8]
> > > +#endif
> >
> > I'm planning to teach the stack unwinder how to unwind through ftrace_regs,
> > such that we wouldn't need to duplicate the LR in a frame record here, and so
> > we'd *only* have the copy inside the struct ftrace_regs.
> 
> Does doing so solve anything beyond reducing the stack footprint by 16 bytes?

My concern is functional rather than about stack space. Having a single copy
means the two can't get out of sync, so the unwinder will always report the
actual return address even when it has been rewritten. That's important for
livepatching, where the return address may be rewritten for function
redirection rather than tracing (and so there's no return path to balance
against), and similarly for ftrace direct calls / trampolines, which we might
need to implement a variant of.

I've already implemented similar logic for unwinding through the pt_regs, and
I'm planning to clean that up and get it out after v6.2-rc1:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/metadata
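
To make the idea a bit more concrete, the rough shape (not what's in that
branch; the marker symbol, the offset, and the layout below are all
assumptions for illustration) is for the unwinder to recognise the
trampoline's frame and read the one authoritative copy of the LR out of the
spilled regs:

#include <linux/types.h>
#include <asm/ptrace.h>
#include <asm/stacktrace.h>

/* Assumed marker: a label at the trampoline's return point. */
extern void ftrace_regs_return(void);

/* Placeholder offset from the trampoline's frame record to the spilled regs. */
#define FTRACE_FP_TO_REGS_OFFSET	16

/*
 * Sketch only: when the unwinder finds a frame created by the ftrace
 * trampoline, recover the callsite's LR from the spilled regs rather
 * than from a duplicated frame record, so there is a single copy that
 * can never get out of sync with what ftrace (or livepatch) rewrote.
 */
static bool unwind_recover_ftrace_lr(struct unwind_state *state)
{
	struct pt_regs *regs;

	if (state->pc != (unsigned long)ftrace_regs_return)
		return false;

	regs = (struct pt_regs *)(state->fp + FTRACE_FP_TO_REGS_OFFSET);
	state->pc = regs->regs[30];	/* the one authoritative LR */

	return true;
}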

> > I think we don't need the copy here if we sign the callsite's LR against the
> > base of the struct ftrace_regs. That way ftrace_graph_func() can sign the
> > updated return address, and this code wouldn't need to care. The ftrace_regs
> > have a copy of x18 that we can use to manipulate the SCS.
> 
> The updated return address will be signed when returning to the call
> site, and we never return from it here or anywhere else, so I don't
> think we need to sign it to begin with.

As above, there's a window where it's spilled to the stack, and I think we
should protect it for that window. Otherwise it can be clobbered before it is
ever signed.

> What we need to sign here is the LR value that return_to_handler()
> will use, so ideally, we'd only sign the callsite's LR if we know we
> will be returning via return_to_handler().
> 
> > > +
> > >       /* Restore the callsite's SP */
> > >       add     sp, sp, #PT_REGS_SIZE + 16
> > >
> > > +     restore_return_address x9
> > > +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > > +     /* compare the original return address with the actual one */
> > > +     cmp     x10, x30
> > > +     b.ne    0f
> > > +
> > > +     /*
> > > +      * If they are the same, unprotect it now. If it was modified, it will
> > > +      * be dealt with in return_to_handler() below.
> > > +      */
> > > +     restore_return_address x30
> > > +0:
> > > +#endif
> > >       ret     x9
> >
> > This means if the return address is clobbered, we'll blindly trust it without
> > authentication, which IMO undermines the point of signing it in the first
> > place.
> 
> How do you mean? x9 is authenticated here, and x30 will either be
> authenticated here or in return_to_handler()

I had confused myself here; sorry for the noise.

> > As above, I'd prefer that we had ftrace_graph_func() fix things up so that we
> > can unconditionally authenticate things here, which would be a bit stronger and
> > simpler to reason about.
> >
> 
> I think having all in one place makes it much easier to reason about,
> tbh. Adding additional handling of the PAC state as well as the shadow
> call stack in ftrace_graph_func() seems much more fiddly to me.

I appreciate that concern, but my intuition here is the inverse; I'd like to
avoid the conditionality in the regular tracing path so that the path is
clearly balanced and (from my perspective) easier to reason about.

I'm happy if we have to do a bit more work in ftrace_graph_func() and
return_to_handler() since those are already more special anyway.
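
As a very rough illustration of the sort of thing I mean (the signature
matches the generic ftrace_graph_func() hook, but the fregs->lr/fregs->fp
fields, the modifier choice, and the bare pacia are all assumptions; real
code would gate the signing behind an ARM64_HAS_ADDRESS_AUTH alternative):

#include <linux/ftrace.h>
#include <asm/ftrace.h>

/* As in arch/arm64/kernel/ftrace.c. */
extern void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
				  unsigned long frame_pointer);

/* Illustrative helper: sign a pointer with the A key against 'mod'. */
static inline unsigned long ftrace_pac_sign(unsigned long ptr, unsigned long mod)
{
	asm("pacia %0, %1" : "+r" (ptr) : "r" (mod));
	return ptr;
}

void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
		       struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	unsigned long *parent = &fregs->lr;	/* assumed field */

	/*
	 * Swap the callsite's LR for return_to_handler(), then sign the
	 * new value against the same modifier the entry code used, so
	 * the exit path can authenticate unconditionally.
	 */
	prepare_ftrace_return(ip, parent, fregs->fp);
	*parent = ftrace_pac_sign(*parent, (unsigned long)fregs);
}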

Thanks,
Mark.
