Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach

Jiri Olsa <jolsa@xxxxxxxxxx> · Sat, 19 Jun 2021 19:09:25 +0200

On Sat, Jun 19, 2021 at 09:19:57AM -0700, Yonghong Song wrote:
> 
> 
> On 6/19/21 1:33 AM, Jiri Olsa wrote:
> > On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:
> > > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> > > > 
> > > > hi,
> > > > saga continues.. ;-) previous post is in here [1]
> > > > 
> > > > After another discussion with Steven, he mentioned that if we fix
> > > > the ftrace graph problem with direct functions, he'd be open to
> > > > add batch interface for direct ftrace functions.
> > > > 
> > > > He already had a proof-of-concept fix for that, which I took and broke
> > > > up into several changes. I added the ftrace direct batch interface
> > > > and bpf new interface on top of that.
> > > > 
> > > > It's not so many patches after all, so I thought having them all
> > > > together will help the review, because they are all connected.
> > > > However I can break this up into separate patchsets if necessary.
> > > > 
> > > > This patchset contains:
> > > > 
> > > >    1) patches (1-4) that fix the ftrace graph tracing over the function
> > > >       with direct trampolines attached
> > > >    2) patches (5-8) that add batch interface for ftrace direct function
> > > >       register/unregister/modify
> > > >    3) patches (9-19) that add support to attach BPF program to multiple
> > > >       functions
> > > > 
> > > > In a nutshell:
> > > > 
> > > > Ad 1) moves the graph tracing setup before the direct trampoline
> > > > prepares the stack, so they don't clash
> > > > 
> > > > Ad 2) uses ftrace_ops interface to register direct function with
> > > > all functions in ftrace_ops filter.
> > > > 
> > > > Ad 3) creates special program and trampoline type to allow attachment
> > > > of multiple functions to single program.
> > > > 
> > > > There are more detailed descriptions in the related changelogs.
> > > > 
> > > > I have working bpftrace multi attachment code on top of this. I briefly
> > > > checked retsnoop and I think it could use the new API as well.
> > > 
> > > Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
> > > The ugly code is at [0] if you'd like to see what kind of changes I
> > > needed to make to use this (it won't work if you check it out because
> > > it needs your libbpf changes synced into submodule, which I only did
> > > locally). But here are some learnings from that experiment both to
> > > emphasize how important it is to make this work and how restrictive
> > > some of the current limitations are.
> > > 
> > > First, good news. Using this mass-attach API to attach to almost 1000
> > > kernel functions goes from
> > > 
> > > Plain fentry/fexit:
> > > ===================
> > > real    0m27.321s
> > > user    0m0.352s
> > > sys     0m20.919s
> > > 
> > > to
> > > 
> > > Mass-attach fentry/fexit:
> > > =========================
> > > real    0m2.728s
> > > user    0m0.329s
> > > sys     0m2.380s
> > 
> > I did not measure the bpftrace speedup, because the new code
> > attached instantly ;-)
> > 
> > > 
> > > It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
> > > preparatory steps not related to fentry/fexit stuff.
> > > 
> > > It's not exactly apples-to-apples, though, because the limitations you
> > > have right now prevent attaching both fentry and fexit programs to
> > > the same set of kernel functions. This makes it pretty useless for a
> > 
> > hum, you could do link_update with fexit program on the link fd,
> > like in the selftest, right?
> > 
> > > lot of cases, in particular for retsnoop. So I haven't really tested
> > > retsnoop end-to-end, I only verified that I do see fentries triggered,
> > > but can't have matching fexits. So the speed-up might be smaller due
> > > to additional fexit mass-attach (once that is allowed), but it's still
> > > a massive difference. So we absolutely need to get this optimization
> > > in.
> > > 
> > > Few more thoughts, if you'd like to plan some more work ahead ;)
> > > 
> > > 1. We need similar mass-attach functionality for kprobe/kretprobe, as
> > > there are use cases where kprobes are more useful than fentry (e.g., >6
> > > args funcs, or funcs with input arguments that are not supported by
> > > BPF verifier, like struct-by-value). It's not clear how to best
> > > represent this, given currently we attach kprobe through perf_event,
> > > but we'll need to think about this for sure.
> > 
> > I'm fighting with the '2 trampolines concept' at the moment, but the
> > mass attach for kprobes seems interesting ;-) will check
> > 
> > > 
> > > 2. To make mass-attach fentry/fexit useful for practical purposes, it
> > > would be really great to have an ability to fetch traced function's
> > > IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
> > > would return the IP of that function, matching the one in
> > > /proc/kallsyms. Right now I do very brittle hacks to do that.
> > 
> > so I hoped that we could store ip always in ctx-8 and have
> > the bpf_get_func_ip helper to access that, but the BPF_PROG
> > macro does not pass ctx value to the program, just args
> 
> ctx does pass to the bpf program. You can check BPF_PROG
> macro definition.

ah right, should have checked it.. so how about we change the
trampoline code to store ip in ctx-8 and make bpf_get_func_ip(ctx)
return [ctx-8]

I'll need to check if it's ok for the tracing helper to take
ctx as argument
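
A rough userspace sketch of the ctx-8 idea, only to illustrate the
layout. It is not the trampoline code and not the real helper: the
names tramp_prepare, model_get_func_ip and MODEL_NR_ARGS are made up
for this example, and it assumes ctx is a pointer to 8-byte argument
slots with the ip stored in the slot right below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_ARGS 6	/* assumption: up to 6 argument slots */

/*
 * Stand-in for the trampoline setup (hypothetical): stack[0] holds the
 * traced function's ip, stack[1..] hold the arguments. The program only
 * ever sees ctx = &stack[1], so the ip sits at ctx - 8 bytes.
 */
static uint64_t *tramp_prepare(uint64_t *stack, uint64_t ip)
{
	assert(stack != NULL);
	stack[0] = ip;
	return &stack[1];
}

/*
 * Stand-in for the proposed bpf_get_func_ip(ctx): return [ctx - 8].
 * For a pointer to 8-byte slots that is simply ctx[-1].
 */
static uint64_t model_get_func_ip(const uint64_t *ctx)
{
	return ctx[-1];
}
```

The point is that the helper needs nothing besides ctx, so it would
work as long as ctx is what gets passed down to the program.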

thanks,
jirka