[External] Re: [PATCH bpf-next v2 1/9] bpf: tracing: add support to record and check the accessed args
梦龙董
dongmenglong.8 at bytedance.com
Fri Mar 29 21:16:52 PDT 2024
On Sat, Mar 30, 2024 at 7:28 AM Andrii Nakryiko
<andrii.nakryiko at gmail.com> wrote:
>
> On Thu, Mar 28, 2024 at 8:10 AM Steven Rostedt <rostedt at goodmis.org> wrote:
> >
> > On Thu, 28 Mar 2024 22:43:46 +0800
> > 梦龙董 <dongmenglong.8 at bytedance.com> wrote:
> >
> > > I have done a simple benchmark on creating 1000
> > > trampolines. It is slow, quite slow, which consume up to
> > > 60s. We can't do it this way.
> > >
> > > Now, I have a bad idea. How about we introduce
> > > a "dynamic trampoline"? The basic logic of it can be:
> > >
> > > """
> > > save regs
> > > bpfs = trampoline_lookup_ip(ip)
> > > fentry = bpfs->fentries
> > > while fentry:
> > > fentry(ctx)
> > > fentry = fentry->next
> > >
> > > call origin
> > > save return value
> > >
> > > fexit = bpfs->fexits
> > > while fexit:
> > > fexit(ctx)
> > > fexit = fexit->next
> > >
> > > xxxxxx
> > > """
> > >
> > > And we lookup the "bpfs" by the function ip in a hash map
> > > in trampoline_lookup_ip. The type of "bpfs" is:
> > >
> > > struct bpf_array {
> > > struct bpf_prog *fentries;
> > > struct bpf_prog *fexits;
> > > struct bpf_prog *modify_returns;
> > > }
> > >
> > > When we need to attach the bpf progA to function A/B/C,
> > > we only need to create the bpf_arrayA, bpf_arrayB, bpf_arrayC
> > > and add the progA to them, and insert them to the hash map
> > > "direct_call_bpfs", and attach the "dynamic trampoline" to
> > > A/B/C. If bpf_arrayA exist, just add progA to the tail of
> > > bpf_arrayA->fentries. When we need to attach progB to
> > > B/C, just add progB to bpf_arrayB->fentries and
> > > bpf_arrayB->fentries.
> > >
> > > Compared to the trampoline, extra overhead is introduced
> > > by the hash lookuping.
> > >
> > > I have not begun to code yet, and I am not sure the overhead is
> > > acceptable. Considering that we also need to do hash lookup
> > > by the function in kprobe_multi, maybe the overhead is
> > > acceptable?
> >
> > Sounds like you are just recreating the function management that ftrace
> > has. It also can add thousands of trampolines very quickly, because it does
> > it in batches. It takes special synchronization steps to attach to fentry.
> > ftrace (and I believe multi-kprobes) updates all the attachments for each
> > step, so the synchronization needed is only done once.
> >
> > If you really want to have thousands of functions, why not just register it
> > with ftrace itself. It will give you the arguments via the ftrace_regs
> > structure. Can't you just register a program as the callback?
> >
> > It will probably make your accounting much easier, and just let ftrace
> > handle the fentry logic. That's what it was made to do.
> >
>
> I thought I'll just ask instead of digging through code, sorry for
> being lazy :) Is there any way to pass pt_regs/ftrace_regs captured
> before function execution to a return probe (fexit/kretprobe)? I.e.,
> how hard is it to pass input function arguments to a kretprobe? That's
> the biggest advantage of fexit over kretprobe, and if we can make
> these original pt_regs/ftrace_regs available to kretprobe, then
> multi-kretprobe will effectively be this multi-fexit.
Yes, we can use multi-kretprobe instead of multi-fexit
if we can obtain the function args in kretprobe.
I think it's hard. The reason that we can obtain the
function args is that we have a trampoline, and it
call the origin function for FEXIT. If we do the same
for multi-kretprobe, we need to modify ftrace_regs_caller
to:
ftrace_regs_caller
|
__ftrace_ops_list_func
|
call all multi-kprobe callbacks
|
call orgin
|
call all multi-kretprobe callbacks
|
call bpf trampoline(for TRACING)
However, this logic conflicts with bpf trampoline,
as it can also call the origin function. What's more,
the FENTRY should be called before the "call origin"
above.
I'm sure if I understand correctly, as I have not
figured out how multi-kretprobe works in fprobe.
Thanks!
Menglong Dong
>
> > -- Steve
More information about the linux-riscv
mailing list