[RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically allocated trampolines

Mark Rutland mark.rutland at arm.com
Wed May 25 05:17:30 PDT 2022


On Thu, May 12, 2022 at 09:02:31PM +0900, Masami Hiramatsu wrote:
> On Wed, 11 May 2022 11:12:07 -0400
> Steven Rostedt <rostedt at goodmis.org> wrote:
> 
> > On Wed, 11 May 2022 23:34:50 +0900
> > Masami Hiramatsu <mhiramat at kernel.org> wrote:
> > 
> > > OK, so fregs::regs will have a subset of pt_regs, and accessibility of
> > > the registers depends on the architecture. If we can have a checker like
> > > 
> > > ftrace_regs_exist(fregs, reg_offset)
> > 
> > Or something. I'd have to see the use case.
> > 
> > > 
> > > kprobe on ftrace or fprobe user (BPF) can filter user's requests.
> > > I think I can introduce a flag for kprobes so that user can make a
> > > kprobe handler only using a subset of registers. 
> > > Maybe similar filter code is also needed for BPF 'user space' library
> > > because this check must be done when compiling BPF.
> > 
> > Is there any other case without full regs that the user would want anything
> > other than the args, stack pointer and instruction pointer?
> 
> For the kprobes APIs/events, yes, they need access to the registers
> that are used for local variables when probing inside the function body.
> However, at the function entry, I think there is almost no use case.
> (BTW, pstate is a bit special: it may show the actual processor-level
> status (context), so for debugging, a user might want to read it.)

As before, if we really need PSTATE we *must* take an exception to get a
reliable snapshot (or to alter the value). So I'd really like to split this
into two cases:

* Where users *really* need PSTATE (or arbitrary GPRs), they use kprobes. That
  always takes an exception and they can have a complete, real struct pt_regs.

* Where users just need to capture a function call boundary, they use ftrace.
  That uses a trampoline without taking an exception, and they get the minimal
  set of registers relevant to the function call boundary (which does not
  include PSTATE or most GPRs).
 
> Thus for BPF via fprobes, I think there is no use case.
> My concern is that BPF may allow a user program to access any
> field of pt_regs. Thus if the user misprograms it, they may see
> a wrong value (I guess the fregs is not zero-filled) for unsaved
> registers.
> 
> > That is, have a flag that says "only_args" or something, that says they
> > will only get the registers for arguments, a stack pointer, and the
> > instruction pointer (note, the fregs may not have the instruction pointer
> > as that is passed to the caller via the "ip" parameter. If the fregs
> > needs that, we can add a "ftrace_regs_set_ip()" before calling the
> > callback registered to the fprobe).
> 
> Yes, that is what I'm thinking. If "only_args" flag is set, BPF runtime
> must check the user program. And if it finds the program accesses the
> unsaved registers, it should stop executing.
> 
> BTW, "what register is saved" can be determined statically, thus I think
> we just need the offset for checking (for the fprobe use case, since it will
> set the ftrace_ops flag by itself.)

For arm64 I'd like to make this static, and have ftrace *always* capture a
minimal set of ftrace_regs, which would be:

  X0 to X8 inclusive
  SP
  PC
  LR
  FP

Since X0 to X8 + SP is all that we need for arguments and return values (per
the calling convention we use), and PC+LR+FP gives us everything we need for
unwinding and live patching.

I *might* want to add x18 to that when SCS is enabled, but I'm not immediately
sure.
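
To make the set above concrete, here is a rough illustration only, not
proposed code. It lays out the minimal register set as a structure and shows
how "which registers are saved" could be answered statically from an offset.
All names (ftrace_regs_sketch, pt_regs_sketch, ftrace_regs_sketch_saved) are
hypothetical, the field ordering is an assumption, and pt_regs_sketch is a
simplified stand-in for the real struct pt_regs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only: the minimal register set
 * described above. Field names and ordering are assumptions.
 */
struct ftrace_regs_sketch {
	uint64_t regs[9];	/* x0..x8 */
	uint64_t fp;		/* x29 */
	uint64_t lr;		/* x30 */
	uint64_t sp;
	uint64_t pc;
};

/* Simplified stand-in for arm64's struct pt_regs: x0..x30, sp, pc, pstate. */
struct pt_regs_sketch {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/*
 * Hypothetical checker in the spirit of the ftrace_regs_exist() idea quoted
 * earlier: given a byte offset into the pt_regs-like layout, report whether
 * that register belongs to the statically known minimal set.
 */
static bool ftrace_regs_sketch_saved(size_t pt_regs_offset)
{
	/* Only whole, aligned 64-bit registers are considered. */
	if (pt_regs_offset % sizeof(uint64_t))
		return false;

	if (pt_regs_offset == offsetof(struct pt_regs_sketch, sp) ||
	    pt_regs_offset == offsetof(struct pt_regs_sketch, pc))
		return true;

	/* Anything else at or past sp (e.g. pstate) is never captured. */
	if (pt_regs_offset >= offsetof(struct pt_regs_sketch, sp))
		return false;

	size_t n = pt_regs_offset / sizeof(uint64_t);	/* GPR index */

	return n <= 8 || n == 29 || n == 30;
}
```

With something like this, a request for PSTATE or for x9..x28 can be rejected
up front, since the answer does not depend on any runtime flag.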

Thanks,
Mark.


