On Wed, Oct 06, 2021 at 03:37:16PM -0700, Andrii Nakryiko wrote: > On Wed, Oct 6, 2021 at 3:05 PM Jiri Olsa <jolsa@xxxxxxxxxx> wrote: > > > > On Wed, Oct 06, 2021 at 02:22:28PM -0700, Andrii Nakryiko wrote: > > > On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@xxxxxxxxxx> wrote: > > > > > > > > On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote: > > > > > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote: > > > > > > > > > > > > hi, > > > > > > I'm hitting performance issue and soft lock ups with the new version > > > > > > of the patchset and the reason seems to be kallsyms lookup that we > > > > > > need to do for each btf id we want to attach > > > > > > > > > > > > I tried to change kallsyms_lookup_name linear search into rbtree search, > > > > > > but it has its own pitfalls like duplicate function names and it still > > > > > > seems not to be fast enough when you want to attach like 30k functions > > > > > > > > > > How not fast enough is it exactly? How long does it take? > > > > > > > > 30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target > > > > > > > > getting soft lock up messages: > > > > > > > > krava33 login: [ 168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087] > > > > > > > > > > That's without RB tree right? I was curious about the case of you > > > converting kallsyms to RB tree and it still being slow. Can't imagine > > > 30k queries against RB tree with ~160k kallsyms taking 75 seconds. > > > > yep, that's the standard kallsyms lookup api > > > > I need to make some adjustment for rbtree kalsyms code, I think I found > > a bug in there, so the numbers are probably better as you suggest > > ok, cool, let's see what are the new numbers then > > > > > > > > > But as I said, why not map BTF IDs into function names, sort function > > > names, and then pass over kallsyms once, doing binary search into a > > > sorted array of requested function names and then recording addr for > > > each. Then check that you found addresses for all functions (it also > > > leaves a question of what to do when we have multiple matching > > > functions, but it's a problem with any approach). If everything checks > > > out, you have a nice btf id -> func name -> func addr mapping. It's > > > O(N log(M)), which sounds like it shouldn't be slow. Definitely not > > > multiple seconds slow. > > > > ok, now that's clear to me, thanks for these details > > great > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > so I wonder we could 'fix this' by storing function address in BTF, > > > > > > which would cut kallsyms lookup completely, because it'd be done in > > > > > > compile time > > > > > > > > > > > > my first thought was to add extra BTF section for that, after discussion > > > > > > with Arnaldo perhaps we could be able to store extra 8 bytes after > > > > > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to > > > > > > indicate that? or new BTF_KIND_FUNC2 type? > > > > > > > > > > > > thoughts? > > > > > > > > > > I'm strongly against this, because (besides the BTF bloat reason) we > > > > > need similar mass attachment functionality for kprobe/kretprobe and > > > > > that one won't be relying on BTF FUNCs, so I think it's better to > > > > > stick to the same mechanism for figuring out the address of the > > > > > function. > > > > > > > > ok > > > > > > > > > > > > > > If RB tree is not feasible, we can do a linear search over unsorted > > > > > kallsyms and do binary search over sorted function names (derived from > > > > > BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is > > > > > number of BTF IDs/functions-to-be-attached-to. If we did have an RB > > > > > tree for kallsyms (is it hard to support duplicates? why?) it could be > > > > > even faster O(Mlog(N)). > > > > > > > > I had issues with generic kallsyms rbtree in the post some time ago, > > > > I'll revisit it to check on details.. but having the tree with just > > > > btf id functions might clear that.. I'll check > > > > > > That's not what I'm proposing. See above. Please let me know if > > > something is not clear before going all in for RB tree implementation > > > :) > > > > > > > > > But while we are on topic, do you think (with ftrace changes you are > > > doing) it would be hard to support multi-attach for > > > kprobes/kretprobes? We now have bpf_link interface for attaching > > > kprobes, so API can be pretty aligned with fentry/fexit, except > > > instead of btf IDs we'd need to pass array of pointers of C strings, I > > > suppose. > > > > hum, I think kprobe/kretprobe is made of perf event (kprobe/kretprobe > > pmus), then you pass event fd and program fd to bpf link syscall, > > and it attaches bpf program to that perf event > > > > so perhaps the user interface would be array of perf events fds and prog fd > > > > also I think you can have just one probe for function, so we will not need > > to share kprobes for multiple users like we need for trampolines, so the > > attach logic will be simple > > creating thousands of perf_event FDs seems expensive (and you'll be > running into the limit of open files pretty soon in a lot of systems). > So I think for multi-attach we'll have to have a separate way where > you'd specify kernel function name (and maybe also offset). I'm just > saying that having BPF_LINK_CREATE command, it's easier (probably) to > extend this for kprobe multi-attach, than trying to retrofit this into > perf_event_open. ah true.. I wonder we could bypass perf by using directly kernel kprobe interface jirka