On Wed, 21 Apr 2021 15:40:37 +0200
Jiri Olsa <jolsa@xxxxxxxxxx> wrote:

> ok, I understand why this would be the best solution for calling
> the program from multiple probes
>
> I think it's the 'attach' layer which is the source of problems
>
> currently there is ftrace's fgraph_ops support that allows fast mass
> attach and calls callbacks for functions entry and exit:
>   https://lore.kernel.org/lkml/20190525031633.811342628@xxxxxxxxxxx/
>
> these callbacks get ip/parent_ip and can get pt_regs (that's not
> implemented at the moment)
>
> but that gets us to the situation of having full pt_regs on both
> entry/exit callbacks that you described above and want to avoid,
> but I think it's the price for having this on top of generic
> tracing layer
>
> the way ftrace's fgraph_ops is implemented, I'm not sure it can
> be as fast as current bpf entry/exit trampoline

Note, the above mentioned code was an attempt to consolidate the code
that does the "hijacking" of the return pointer in order to record the
return of a function. At the time there were only kretprobes and function
graph tracing. Now bpf has another version. That means there are three
utilities that record the exit of a function. What we need is a single
method that works for all three utilities. And I'm perfectly fine with a
rewrite of the function graph tracer to do that. The one problem is that
function graph and kretprobes work on pretty much all the architectures
now, and whatever we decide to do, we can't break those architectures.

One way is to have an abstract layer that allows function graph and
kretprobes to work with the old implementation as well as, depending on
which config option is set, a new implementation that also supports bpf
trampolines.
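
To make the "abstract layer" idea a bit more concrete, here is a rough
user-space sketch. It is an illustration only, not kernel code: every name
in it (ret_hook_ops, ret_hook_backend, ret_hook_register,
HYPOTHETICAL_RET_HOOK_GENERIC, and so on) is made up for this example, and
both backends simply record the hook in a table instead of touching a
return address. The point is just that tracers register against one
interface, while a build-time switch picks which implementation sits
underneath.

```c
/* Illustrative user-space sketch; none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks a tracer (fgraph, kretprobes, bpf) would supply. */
struct ret_hook_ops {
	void (*entry_cb)(unsigned long ip, unsigned long parent_ip, void *data);
	void (*exit_cb)(unsigned long ip, unsigned long retval, void *data);
	void *data;
};

/* Hypothetical backend: the code that actually hooks the return. */
struct ret_hook_backend {
	const char *name;
	int supports_bpf_trampoline;
	int (*attach)(unsigned long ip, const struct ret_hook_ops *ops);
};

#define MAX_HOOKS 16

static struct {
	unsigned long ip;
	const struct ret_hook_ops *ops;
} slots[MAX_HOOKS];
static size_t nr_slots;

/* Shared bookkeeping: remember which ops are attached to which ip. */
static int record_hook(unsigned long ip, const struct ret_hook_ops *ops)
{
	assert(ops != NULL);
	if (nr_slots >= MAX_HOOKS)
		return -1;
	slots[nr_slots].ip = ip;
	slots[nr_slots].ops = ops;
	nr_slots++;
	return 0;
}

/* A real version of these two would differ; here both only record the hook. */
static int legacy_attach(unsigned long ip, const struct ret_hook_ops *ops)
{
	return record_hook(ip, ops);
}

static int generic_attach(unsigned long ip, const struct ret_hook_ops *ops)
{
	return record_hook(ip, ops);
}

static const struct ret_hook_backend backends[] = {
	{ "legacy",  0, legacy_attach  },
	{ "generic", 1, generic_attach },
};

/* Stand-in for a config option selecting the implementation at build time. */
#ifdef HYPOTHETICAL_RET_HOOK_GENERIC
#define RET_HOOK_BACKEND 1
#else
#define RET_HOOK_BACKEND 0
#endif

const struct ret_hook_backend *ret_hook_current_backend(void)
{
	return &backends[RET_HOOK_BACKEND];
}

/* The single entry point all tracers would use, whatever the backend. */
int ret_hook_register(unsigned long ip, const struct ret_hook_ops *ops)
{
	if (!ops || !ops->exit_cb)
		return -1;
	return ret_hook_current_backend()->attach(ip, ops);
}

/* Simulate the function at 'ip' returning; run every exit callback on it. */
int ret_hook_simulate_return(unsigned long ip, unsigned long retval)
{
	int called = 0;

	for (size_t i = 0; i < nr_slots; i++) {
		if (slots[i].ip == ip) {
			slots[i].ops->exit_cb(ip, retval, slots[i].ops->data);
			called++;
		}
	}
	return called;
}

/* Example tracer callback: add each return value to a counter in 'data'. */
void sum_retvals(unsigned long ip, unsigned long retval, void *data)
{
	(void)ip;
	*(unsigned long *)data += retval;
}
```

A caller would fill in a ret_hook_ops with sum_retvals as the exit
callback, pass it to ret_hook_register(), and never need to know which
backend was built in. Architectures that cannot support the new
implementation would simply keep building with the old one.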

>
> but to better understand the pain points I think I'll try to implement
> the 'mass trampolines' call to the bpf program you described above and
> attach it for now to fgraph_ops callbacks

One thing that ftrace gives you is a way to have each function call its
own trampoline, then depending on what is attached, each one can have
multiple implementations.

One thing that needs to be fixed is the interaction between the direct
trampoline and function graph and kretprobes, as the direct trampoline
will break both of them, with the bpf implementation tracing after it.

I would be interested in what a mass generic trampoline would look like,
if it had to deal with handling a function with 1 parameter and one with
12 parameters. From this thread, I was told it can currently only handle
6 parameters on x86_64. Not sure how it works on x86_32.

>
> perhaps this is a good topic to discuss in one of the Thursday's BPF mtg?

I'm unaware of these meetings.

-- Steve