On Thu, Mar 14, 2024 at 2:29 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Wed, Mar 13, 2024 at 05:25:35PM -0700, Alexei Starovoitov wrote:
> > On Tue, Mar 12, 2024 at 6:53 PM 梦龙董 <dongmenglong.8@xxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 13, 2024 at 12:42 AM Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Mon, Mar 11, 2024 at 7:42 PM 梦龙董 <dongmenglong.8@xxxxxxxxxxxxx> wrote:
> > > >
> > > > [......]
> > > >
> > > > I see.
> > > > I thought you're sharing the trampoline across attachments.
> > > > (since bpf prog is the same).
> > >
> > > That seems to be a good idea, which I hadn't thought of before.
> > > > But above approach cannot possibly work with a shared trampoline.
> > > > You need to create an individual trampoline for each attachment
> > > > and point them all to a single bpf prog.
> > > >
> > > > tbh I'm less excited about this feature now, since sharing
> > > > the prog across different attachments is nice, but it won't scale
> > > > to thousands of attachments.
> > > > I assumed that there will be a single trampoline with max(argno)
> > > > across attachments and attach/detach will scale to thousands.
> > > >
> > > > With individual trampolines this will work for up to a hundred
> > > > attachments max.
> > >
> > > What does "a hundred attachments max" mean? Can't I
> > > trace thousands of kernel functions with a tracing
> > > multi-link bpf program?
> >
> > I mean what time does it take to attach one program
> > to 100 fentry-s ?
> > What is the time for 1k and for 10k ?
> >
> > The kprobe multi test attaches to pretty much all funcs in
> > /sys/kernel/tracing/available_filter_functions
> > and it's fast enough to run in test_progs on every commit in bpf CI.
> > See get_syms() in prog_tests/kprobe_multi_test.c
> >
> > Can this new multi fentry do that?
> > and at what speed?
> > The answer will decide how applicable this api is going to be.
> > Generating different trampolines for every attach point
> > is an approach as well. Pls benchmark it too.
> > > >
> > > > Let's step back.
> > > > What is the exact use case you're trying to solve?
> > > > Not an artificial one like the selftest in patch 9, but the real use case?
> > >
> > > I have a tool, which is used to diagnose network problems,
> > > and its name is "nettrace". It will trace many kernel functions, whose
> > > function args contain "skb", like this:
> > >
> > > ./nettrace -p icmp
> > > begin trace...
> > > ***************** ffff889be8fbd500,ffff889be8fbcd00 ***************
> > > [1272349.614564] [dev_gro_receive ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614579] [__netif_receive_skb_core] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614585] [ip_rcv ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614592] [ip_rcv_core ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614599] [skb_clone ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614616] [nf_hook_slow ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614629] [nft_do_chain ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614635] [ip_rcv_finish ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614643] [ip_route_input_slow ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614647] [fib_validate_source ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614652] [ip_local_deliver ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614658] [nf_hook_slow ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614663] [ip_local_deliver_finish] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614666] [icmp_rcv ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614671] [icmp_echo ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614675] [icmp_reply ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614715] [consume_skb ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614722] [packet_rcv ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > > [1272349.614725] [consume_skb ] ICMP: 169.254.128.15 ->
> > > 172.27.0.6 ping request, seq: 48220
> > >
> > > For now, I have to create a bpf program for every kernel
> > > function that I want to trace, which is up to 200.
> > >
> > > With this multi-link, I only need to create 5 bpf programs,
> > > like this:
> > >
> > > int BPF_PROG(trace_skb_1, struct sk_buff *skb);
> > > int BPF_PROG(trace_skb_2, u64 arg0, struct sk_buff *skb);
> > > int BPF_PROG(trace_skb_3, u64 arg0, u64 arg1, struct sk_buff *skb);
> > > int BPF_PROG(trace_skb_4, u64 arg0, u64 arg1, u64 arg2, struct sk_buff *skb);
> > > int BPF_PROG(trace_skb_5, u64 arg0, u64 arg1, u64 arg2, u64 arg3, struct sk_buff *skb);
> > >
> > > Then, I can attach trace_skb_1 to all the kernel functions that
> > > I want to trace and whose first arg is skb; attach trace_skb_2 to kernel
> > > functions whose 2nd arg is skb, etc.
> > >
> > > Or, I can create only one bpf program and store the index
> > > of skb in the attachment cookie, and attach this program to all
> > > the kernel functions that I want to trace.
> > >
> > > This is my use case. With the multi-link, I now only have
> > > 1 bpf program, 1 bpf link and 200 trampolines, instead of 200
> > > bpf programs, 200 bpf links and 200 trampolines.
> >
> > I see. The use case makes sense to me.
> > Andrii's retsnoop was used to do a similar thing before kprobe multi was
> > introduced.
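
For reference, the "one program + cookie" variant above could look
roughly like the sketch below. It assumes the existing
bpf_get_attach_cookie() and bpf_get_func_arg() tracing helpers; how
each attach point gets its own cookie (and the multi-link attach
itself) is exactly what this series would have to provide and is not
shown here:

/* Minimal sketch, not from the patchset: one fentry program for all
 * attach points, with the index of the skb argument carried in the
 * per-attachment bpf cookie.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fentry")	/* attach target(s) resolved by the loader */
int BPF_PROG(trace_skb)
{
	/* cookie == index of the skb argument for this attach point */
	__u64 skb_idx = bpf_get_attach_cookie(ctx);
	struct sk_buff *skb;
	__u64 arg = 0;

	if (bpf_get_func_arg(ctx, skb_idx, &arg))
		return 0;
	skb = (struct sk_buff *)arg;
	if (!skb)
		return 0;

	/* ... read skb fields (e.g. with BPF_CORE_READ()) and emit the
	 * per-function event shown in the trace above ...
	 */
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

With something like this, the 5 trace_skb_N variants collapse into a
single program, at the cost of two helper calls per hit.
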
> > > The shared trampoline you mentioned seems to be a
> > > wonderful idea, which can reduce the 200 trampolines
> > > to one. Let me have a look: we create a trampoline and
> > > record the max args count of all the target functions; let's
> > > mark it as arg_count.
> > >
> > > When generating the trampoline, we assume that the
> > > function args count is arg_count. During attaching, we
> > > check the consistency of all the target functions, just like
> > > what we do now.
> >
> > For one trampoline to handle all attach points we might
> > need some arch support, but we can start simple.
> > Make a btf_func_model with MAX_BPF_FUNC_REG_ARGS
> > by calling btf_distill_func_proto() with func==NULL.
> > And use that to build a trampoline.
> >
> > The challenge is how to use a minimal number of trampolines
> > when bpf_progA is attached to func1, func2, func3
> > and bpf_progB is attached to func3, func4, func5.
> > We'd still need 3 trampolines:
> > for func[12] to call bpf_progA,
> > for func3 to call bpf_progA and bpf_progB,
> > for func[45] to call bpf_progB.
> >
> > Jiri was trying to solve it in the past. His slides from LPC:
> > https://lpc.events/event/16/contributions/1350/attachments/1033/1983/plumbers.pdf
> >
> > Pls study them and his prior patchsets to avoid stepping on the same rakes.
>
> yep, I refrained from commenting not to take you down the same path
> I did, but if you insist.. ;-)
>
> I managed to forget almost all of it, but IIRC the main pain point
> was that at some point I had to split an existing trampoline, which caused
> the whole trampoline management and error paths to become a mess
>
> I tried to explain things in the [1] changelog, and the latest patchset is in [0]
>
> feel free to use/take anything, but I advise strongly against it ;-)
> please let me know if I can help

I have to say that I have not gone far enough to run into this
problem, and I didn't dig deep enough to be aware of the complexity.
I suspect that I can't overcome this challenge.

The only thing I thought of when I heard about the "shared
trampoline" is to fall back and not use the shared trampoline
for the kernel functions that already have a trampoline.

Anyway, let's give it a try, based on your research.

Thanks!
Menglong Dong

>
> jirka
>
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=bpf/batch
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/commit/?h=bpf/batch&id=52a1d4acdf55df41e99ca2cea51865e6821036ce
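
For reference, the func==NULL fallback Alexei mentions above already
yields the "max args" btf_func_model a shared trampoline would be
generated from. A kernel-side sketch, purely illustrative, with
trampoline creation, keying and the splitting problem Jiri describes
all left out:

/* Sketch only: build a btf_func_model that assumes
 * MAX_BPF_FUNC_REG_ARGS 8-byte arguments, which is what a shared
 * trampoline covering functions with different signatures would use.
 */
#include <linux/bpf.h>
#include <linux/btf.h>

static int shared_tramp_func_model(struct btf_func_model *m)
{
	/* log, btf and fname are not looked at when func_proto is NULL;
	 * btf_distill_func_proto() then falls back to
	 * MAX_BPF_FUNC_REG_ARGS u64 args and an 8-byte return value.
	 */
	return btf_distill_func_proto(NULL, NULL, NULL, NULL, m);
}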