On Thu, Mar 28, 2024 at 8:10 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 28 Mar 2024 22:43:46 +0800
> 梦龙董 <dongmenglong.8@xxxxxxxxxxxxx> wrote:
>
> > I have done a simple benchmark on creating 1000
> > trampolines. It is slow, quite slow, consuming up to
> > 60s. We can't do it this way.
> >
> > Now, I have a bad idea. How about we introduce
> > a "dynamic trampoline"? The basic logic of it can be:
> >
> > """
> > save regs
> > bpfs = trampoline_lookup_ip(ip)
> > fentry = bpfs->fentries
> > while fentry:
> >   fentry(ctx)
> >   fentry = fentry->next
> >
> > call origin
> > save return value
> >
> > fexit = bpfs->fexits
> > while fexit:
> >   fexit(ctx)
> >   fexit = fexit->next
> >
> > xxxxxx
> > """
> >
> > And we lookup the "bpfs" by the function ip in a hash map
> > in trampoline_lookup_ip. The type of "bpfs" is:
> >
> > struct bpf_array {
> >   struct bpf_prog *fentries;
> >   struct bpf_prog *fexits;
> >   struct bpf_prog *modify_returns;
> > };
> >
> > When we need to attach the bpf progA to function A/B/C,
> > we only need to create the bpf_arrayA, bpf_arrayB, bpf_arrayC
> > and add the progA to them, and insert them into the hash map
> > "direct_call_bpfs", and attach the "dynamic trampoline" to
> > A/B/C. If bpf_arrayA exists, just add progA to the tail of
> > bpf_arrayA->fentries. When we need to attach progB to
> > B/C, just add progB to bpf_arrayB->fentries and
> > bpf_arrayC->fentries.
> >
> > Compared to the trampoline, extra overhead is introduced
> > by the hash lookup.
> >
> > I have not begun to code yet, and I am not sure the overhead is
> > acceptable. Considering that we also need to do hash lookup
> > by the function in kprobe_multi, maybe the overhead is
> > acceptable?
>
> Sounds like you are just recreating the function management that ftrace
> has. It also can add thousands of trampolines very quickly, because it does
> it in batches. It takes special synchronization steps to attach to fentry.
> ftrace (and I believe multi-kprobes) updates all the attachments for each
> step, so the synchronization needed is only done once.
>
> If you really want to have thousands of functions, why not just register it
> with ftrace itself. It will give you the arguments via the ftrace_regs
> structure. Can't you just register a program as the callback?
>
> It will probably make your accounting much easier, and just let ftrace
> handle the fentry logic. That's what it was made to do.
>

I thought I'll just ask instead of digging through code, sorry for
being lazy :) Is there any way to pass pt_regs/ftrace_regs captured
before function execution to a return probe (fexit/kretprobe)? I.e.,
how hard is it to pass input function arguments to a kretprobe? That's
the biggest advantage of fexit over kretprobe, and if we can make
these original pt_regs/ftrace_regs available to kretprobe, then
multi-kretprobe will effectively be this multi-fexit.

> -- Steve
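
For reference, the "dynamic trampoline" dispatch quoted above can be
sketched in plain userspace C roughly as follows. This is only an
illustration under assumptions, not kernel code: struct ctx, struct prog,
get_or_create(), attach_fentry(), attach_fexit(), the fixed-size table
and the sample progs are hypothetical names that do not appear in the
proposal, a single long stands in for the saved registers, and the
locking/RCU a real implementation would need is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 64                    /* hypothetical table size */

/* Stand-in for the saved registers: one argument plus the return value. */
struct ctx { uint64_t ip; long arg; long ret; };

typedef void (*prog_fn)(struct ctx *);
struct prog { prog_fn fn; struct prog *next; };

/* Mirrors the proposed "bpf_array": one instance per traced function. */
struct bpf_array {
    uint64_t ip;
    struct prog *fentries;
    struct prog *fexits;
    struct bpf_array *hnext;           /* hash bucket chain */
};

static struct bpf_array *direct_call_bpfs[NBUCKETS];

static struct bpf_array *trampoline_lookup_ip(uint64_t ip)
{
    struct bpf_array *b = direct_call_bpfs[ip % NBUCKETS];

    while (b && b->ip != ip)
        b = b->hnext;
    return b;
}

static struct bpf_array *get_or_create(uint64_t ip)
{
    struct bpf_array *b = trampoline_lookup_ip(ip);

    if (b)
        return b;
    b = calloc(1, sizeof(*b));
    assert(b);
    b->ip = ip;
    b->hnext = direct_call_bpfs[ip % NBUCKETS];
    direct_call_bpfs[ip % NBUCKETS] = b;
    return b;
}

/* Append to the tail so progs run in attach order. */
static void append(struct prog **head, prog_fn fn)
{
    struct prog *p = calloc(1, sizeof(*p));

    assert(p);
    p->fn = fn;
    while (*head)
        head = &(*head)->next;
    *head = p;
}

static void attach_fentry(uint64_t ip, prog_fn fn)
{
    append(&get_or_create(ip)->fentries, fn);
}

static void attach_fexit(uint64_t ip, prog_fn fn)
{
    append(&get_or_create(ip)->fexits, fn);
}

/* One shared trampoline body: look up the per-ip lists, run them. */
static long dynamic_trampoline(uint64_t ip, long (*origin)(long), long arg)
{
    struct ctx ctx = { .ip = ip, .arg = arg, .ret = 0 };  /* "save regs" */
    struct bpf_array *bpfs = trampoline_lookup_ip(ip);
    struct prog *p;

    for (p = bpfs ? bpfs->fentries : NULL; p; p = p->next)
        p->fn(&ctx);

    ctx.ret = origin(arg);             /* "call origin", "save return value" */

    for (p = bpfs ? bpfs->fexits : NULL; p; p = p->next)
        p->fn(&ctx);                   /* fexit sees both ctx.arg and ctx.ret */

    return ctx.ret;
}

/* Hypothetical sample progs and traced function. */
static int entry_hits, exit_hits;
static long last_arg, last_ret;

static void count_entry(struct ctx *c) { (void)c; entry_hits++; }
static void record_exit(struct ctx *c)
{
    exit_hits++;
    last_arg = c->arg;
    last_ret = c->ret;
}
static long twice(long x) { return 2 * x; }
```

The per-call cost over a dedicated trampoline here is the bucket walk in
trampoline_lookup_ip(), which is the hash lookup overhead the proposal
worries about. Note that record_exit() reads the input argument at exit
time, which is the fexit property asked about above.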