On Thu, Mar 17, 2022 at 8:53 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Mar 16, 2022 at 5:26 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> > +
> > +struct bpf_link *
> > +bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
> > +                                      const char *pattern,
> > +                                      const struct bpf_kprobe_multi_opts *opts)
> > +{
> > +	LIBBPF_OPTS(bpf_link_create_opts, lopts);
> > +	struct kprobe_multi_resolve res = {
> > +		.pattern = pattern,
> > +	};
> > +	struct bpf_link *link = NULL;
> > +	char errmsg[STRERR_BUFSIZE];
> > +	const unsigned long *addrs;
> > +	int err, link_fd, prog_fd;
> > +	const __u64 *cookies;
> > +	const char **syms;
> > +	bool retprobe;
> > +	size_t cnt;
> > +
> > +	if (!OPTS_VALID(opts, bpf_kprobe_multi_opts))
> > +		return libbpf_err_ptr(-EINVAL);
> > +
> > +	syms    = OPTS_GET(opts, syms, false);
> > +	addrs   = OPTS_GET(opts, addrs, false);
> > +	cnt     = OPTS_GET(opts, cnt, false);
> > +	cookies = OPTS_GET(opts, cookies, false);
> > +
> > +	if (!pattern && !addrs && !syms)
> > +		return libbpf_err_ptr(-EINVAL);
> > +	if (pattern && (addrs || syms || cookies || cnt))
> > +		return libbpf_err_ptr(-EINVAL);
> > +	if (!pattern && !cnt)
> > +		return libbpf_err_ptr(-EINVAL);
> > +	if (addrs && syms)
> > +		return libbpf_err_ptr(-EINVAL);
> > +
> > +	if (pattern) {
> > +		err = libbpf_kallsyms_parse(resolve_kprobe_multi_cb, &res);
> > +		if (err)
> > +			goto error;
> > +		if (!res.cnt) {
> > +			err = -ENOENT;
> > +			goto error;
> > +		}
> > +		addrs = res.addrs;
> > +		cnt = res.cnt;
> > +	}
>
> Thanks Jiri.
> Great stuff and a major milestone!
> I've applied Masami's and your patches to bpf-next.
>
> But the above needs more work.
> Currently test_progs -t kprobe_multi
> takes 4 seconds on lockdep+debug kernel.
> Mainly because of the above loop.
>
>   18.05%  test_progs  [kernel.kallsyms]  [k] kallsyms_expand_symbol.constprop.4
>   12.53%  test_progs  libc-2.28.so       [.] _IO_vfscanf
>    6.31%  test_progs  [kernel.kallsyms]  [k] number
>    4.66%  test_progs  [kernel.kallsyms]  [k] format_decode
>    4.65%  test_progs  [kernel.kallsyms]  [k] string_nocheck
>
> Single test_skel_api() subtest takes almost a second.
>
> A cache inside libbpf probably won't help.
> Maybe introduce a bpf iterator for kallsyms?

A BPF iterator for kallsyms is a great idea! So many benefits:

  - it should be significantly more efficient *and* simpler than
    parsing /proc/kallsyms;
  - there were some upstream patches recording ksym length (i.e.,
    function size); I don't remember whether that ever landed, but
    besides that, the main complication in exposing symbol size to
    user space was the concern that the /proc/kallsyms format is an
    ABI. With a BPF iterator we can easily provide that symbol size
    without any breakage. This would be great!
  - we can allow parameterizing the iterator with options like: skip
    or include module symbols, specify a set of symbol types
    (function, variable, etc.), and so on. This would speed up common
    cases by not even decompressing irrelevant names.

In short, a kallsyms iterator would be immensely useful for any sort
of tracing tool that deals with kernel stack traces or kallsyms in
general.

But in this particular case, the kprobe_multi_resolve_syms()
implementation is extremely suboptimal. I didn't realize during review
that kallsyms_lookup_name() is a linear scan... If that's not going to
be changed to O(log(N)) some time soon, we probably need to reimplement
kprobe_multi_resolve_syms(). One way would be to sort the user strings
lexicographically and then do a single linear scan over all kallsyms,
performing a binary search over the sorted array of user strings for
each symbol. We'd stop once all the positions were "filled in" (we'd
need to keep a bitmap or bool[], probably). This way it's going to be
O(M*log(N)) instead of O(M*N) as it is right now.
BTW, Jiri, libbpf.map is supposed to have an alphabetically ordered
list of functions, so it would be good to move
bpf_program__attach_kprobe_multi_opts a bit higher, before the
libbpf_* functions.

> On the kernel side kprobe_multi_resolve_syms() looks similarly
> inefficient. I'm not sure whether it would be a bottleneck though.
>
> Orthogonal to this issue please add a new stress test
> to selftests/bpf that attaches to a lot of functions.