On Tue, Jun 07, 2022 at 09:10:32PM -0700, Alexei Starovoitov wrote: > On Tue, Jun 7, 2022 at 12:56 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > > > > On Tue, Jun 07, 2022 at 11:40:47AM -0700, Alexei Starovoitov wrote: > > > On Mon, Jun 6, 2022 at 11:48 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote: > > > > > > > > When user specifies symbols and cookies for kprobe_multi link > > > > interface it's very likely the cookies will be misplaced and > > > > returned to wrong functions (via get_attach_cookie helper). > > > > > > > > The reason is that to resolve the provided functions we sort > > > > them before passing them to ftrace_lookup_symbols, but we do > > > > not do the same sort on the cookie values. > > > > > > > > Fixing this by using sort_r function with custom swap callback > > > > that swaps cookie values as well. > > > > > > > > Fixes: 0236fec57a15 ("bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link") > > > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx> > > > > > > It looks good, but something in this patch is causing a regression: > > > ./test_progs -t kprobe_multi > > > test_kprobe_multi_test:PASS:load_kallsyms 0 nsec > > > #80/1 kprobe_multi_test/skel_api:OK > > > #80/2 kprobe_multi_test/link_api_addrs:OK > > > #80/3 kprobe_multi_test/link_api_syms:OK > > > #80/4 kprobe_multi_test/attach_api_pattern:OK > > > #80/5 kprobe_multi_test/attach_api_addrs:OK > > > #80/6 kprobe_multi_test/attach_api_syms:OK > > > #80/7 kprobe_multi_test/attach_api_fails:OK > > > test_bench_attach:PASS:get_syms 0 nsec > > > test_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec > > > libbpf: prog 'test_kprobe_empty': failed to attach: No such process > > > test_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts > > > unexpected error: -3 > > > #80/8 kprobe_multi_test/bench_attach:FAIL > > > #80 kprobe_multi_test:FAIL > > > > looks like kallsyms search failed to find some symbol, > > but I can't reproduce with: > > > > ./vmtest.sh -- ./test_progs -t kprobe_multi > > > > can you share .config you used? > > I don't think it's config related. > Patch 2 is doing: > > - if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp)) > + sym = bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp); > + if (!sym) > + return 0; > + > + idx = sym - args->syms; > + if (args->addrs[idx]) > return 0; > > addr = ftrace_location(addr); > if (!addr) > return 0; > > - args->addrs[args->found++] = addr; > + args->addrs[idx] = addr; > + args->found++; > return args->found == args->cnt ? 1 : 0; > > There are plenty of functions with the same name > in available_filter_functions. > So > if (args->addrs[idx]) > return 0; > triggers for a lot of them. > At the end args->found != args->cnt. there's code in get_syms (prog_tests/kprobe_multi_test.c) that filters out duplicates > > Here is trivial debug patch: > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 601ccf1b2f09..c567cf56cb57 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -8037,8 +8037,10 @@ static int kallsyms_callback(void *data, const > char *name, > return 0; > > idx = sym - args->syms; > - if (args->addrs[idx]) > + if (args->addrs[idx]) { > + printk("idx %x name %s\n", idx, name); > return 0; > + } > > addr = ftrace_location(addr); > if (!addr) > @@ -8078,6 +8080,7 @@ int ftrace_lookup_symbols(const char > **sorted_syms, size_t cnt, unsigned long *a > err = kallsyms_on_each_symbol(kallsyms_callback, &args); > if (err < 0) > return err; > + printk("found %zd cnt %zd\n", args.found, args.cnt); > return args.found == args.cnt ? 0 : -ESRCH; > } > > [ 13.096160] idx a500 name unregister_vclock > [ 13.096930] idx 82fb name pt_regs_offset > [ 13.106969] idx 92be name set_root > [ 13.107290] idx 4414 name event_function > [ 13.112570] idx 7d1d name phy_init > [ 13.114459] idx 7d13 name phy_exit > [ 13.114777] idx ab91 name watchdog > [ 13.115730] found 46921 cnt 47036 > > I don't understand how it works for you at all. > It passes in BPF CI only because we don't run > kprobe_multi_test/bench_attach there (yet). reproduced after I updated the tree today.. not sure why I did not see that before, going to check jirka