On Wed, Mar 16, 2022 at 5:26 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> +
> +struct bpf_link *
> +bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
> +				      const char *pattern,
> +				      const struct bpf_kprobe_multi_opts *opts)
> +{
> +	LIBBPF_OPTS(bpf_link_create_opts, lopts);
> +	struct kprobe_multi_resolve res = {
> +		.pattern = pattern,
> +	};
> +	struct bpf_link *link = NULL;
> +	char errmsg[STRERR_BUFSIZE];
> +	const unsigned long *addrs;
> +	int err, link_fd, prog_fd;
> +	const __u64 *cookies;
> +	const char **syms;
> +	bool retprobe;
> +	size_t cnt;
> +
> +	if (!OPTS_VALID(opts, bpf_kprobe_multi_opts))
> +		return libbpf_err_ptr(-EINVAL);
> +
> +	syms = OPTS_GET(opts, syms, false);
> +	addrs = OPTS_GET(opts, addrs, false);
> +	cnt = OPTS_GET(opts, cnt, false);
> +	cookies = OPTS_GET(opts, cookies, false);
> +
> +	if (!pattern && !addrs && !syms)
> +		return libbpf_err_ptr(-EINVAL);
> +	if (pattern && (addrs || syms || cookies || cnt))
> +		return libbpf_err_ptr(-EINVAL);
> +	if (!pattern && !cnt)
> +		return libbpf_err_ptr(-EINVAL);
> +	if (addrs && syms)
> +		return libbpf_err_ptr(-EINVAL);
> +
> +	if (pattern) {
> +		err = libbpf_kallsyms_parse(resolve_kprobe_multi_cb, &res);
> +		if (err)
> +			goto error;
> +		if (!res.cnt) {
> +			err = -ENOENT;
> +			goto error;
> +		}
> +		addrs = res.addrs;
> +		cnt = res.cnt;
> +	}

Thanks Jiri.
Great stuff and a major milestone!
I've applied Masami's and your patches to bpf-next.

But the above needs more work.
Currently test_progs -t kprobe_multi takes 4 seconds on lockdep+debug kernel.
Mainly because of the above loop.

  18.05%  test_progs  [kernel.kallsyms]  [k] kallsyms_expand_symbol.constprop.4
  12.53%  test_progs  libc-2.28.so       [.] _IO_vfscanf
   6.31%  test_progs  [kernel.kallsyms]  [k] number
   4.66%  test_progs  [kernel.kallsyms]  [k] format_decode
   4.65%  test_progs  [kernel.kallsyms]  [k] string_nocheck

Single test_skel_api() subtest takes almost a second.

A cache inside libbpf probably won't help.
Maybe introduce a bpf iterator for kallsyms?

On the kernel side kprobe_multi_resolve_syms() looks similarly inefficient.
I'm not sure whether it would be a bottleneck though.

Orthogonal to this issue, please add a new stress test to selftests/bpf
that attaches to a lot of functions.
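
To make the user-space half of that cost concrete, here is a rough sketch of
the per-line work that pattern resolution implies: split each kallsyms line
into address, type and name, glob-match the name, and collect the address.
This is not the libbpf code. struct sym_matches, sym_matches_add() and
resolve_pattern() are hypothetical names, POSIX fnmatch() stands in for
libbpf's internal glob matching, and the input is an in-memory buffer rather
than /proc/kallsyms. It parses with strtoull() instead of a scanf-style
format, so at best it would trim the _IO_vfscanf share of the profile; the
kernel-side cost of formatting every symbol is untouched.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result holder; not a libbpf type. */
struct sym_matches {
	unsigned long long *addrs;
	size_t cnt;
	size_t cap;
};

/* Append one address, growing the array geometrically. */
static int sym_matches_add(struct sym_matches *m, unsigned long long addr)
{
	if (m->cnt == m->cap) {
		size_t new_cap = m->cap ? m->cap * 2 : 64;
		unsigned long long *tmp = realloc(m->addrs, new_cap * sizeof(*tmp));

		if (!tmp)
			return -1;
		m->addrs = tmp;
		m->cap = new_cap;
	}
	m->addrs[m->cnt++] = addr;
	return 0;
}

/*
 * Walk kallsyms-formatted text ("<hex addr> <type> <name>[\t[module]]\n")
 * and collect the addresses of symbols whose name matches the glob pattern.
 * Returns 0 on success, -1 on allocation failure or malformed input.
 */
static int resolve_pattern(const char *text, const char *pattern,
			   struct sym_matches *m)
{
	char name[512];
	const char *p = text;

	while (*p) {
		char *end;
		unsigned long long addr = strtoull(p, &end, 16);
		size_t len;

		/* Expect "<addr> <type> " before the symbol name. */
		if (end == p || end[0] != ' ' || !end[1] || end[2] != ' ')
			return -1;
		p = end + 3;

		/* The name runs up to a tab (module suffix) or a newline. */
		len = strcspn(p, "\t\n");
		if (len >= sizeof(name))
			return -1;
		memcpy(name, p, len);
		name[len] = '\0';

		if (fnmatch(pattern, name, 0) == 0 && sym_matches_add(m, addr))
			return -1;

		/* Skip the rest of the line, including the newline. */
		p += len;
		p += strcspn(p, "\n");
		if (*p == '\n')
			p++;
	}
	return 0;
}
```

A caller would zero-initialize a struct sym_matches, pass the buffer and a
pattern such as "bpf_fentry_test*", and free() the addrs array afterwards.
Even with cheaper parsing, every attach still walks the whole symbol list.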