On Fri, Mar 18, 2022 at 10:50:46PM -0700, Andrii Nakryiko wrote:
> On Wed, Mar 16, 2022 at 5:24 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> >
> > hi,
> > this patchset adds a new link type BPF_TRACE_KPROBE_MULTI that attaches
> > a kprobe program through the fprobe API [1] introduced by Masami.
> >
> > The fprobe API allows attaching a probe to multiple functions at once very
> > quickly, because it works on top of ftrace. On the other hand, this limits
> > the probe point to function entry or return.
> >
> >
> > With bpftrace support I see the following attach speed:
> >
> >   # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
> >   Attaching 2 probes...
> >   Attaching 3342 functions
> >   ...
> >
> >   1.4960 +- 0.0285 seconds time elapsed  ( +- 1.91% )
> >
> > v3 changes:
> >   - based on latest fprobe post from Masami [2]
> >   - add acks
> >   - add extra comment to kprobe_multi_link_handler wrt entry ip setup [Masami]
> >   - keep swap_words_64 static and swap values directly in
> >     bpf_kprobe_multi_cookie_swap [Andrii]
> >   - rearrange locking/migrate setup in kprobe_multi_link_prog_run [Andrii]
> >   - move uapi fields [Andrii]
> >   - add bpf_program__attach_kprobe_multi_opts function [Andrii]
> >   - many small test changes [Andrii]
> >   - added tests for bpf_program__attach_kprobe_multi_opts
> >   - make kallsyms_lookup_name check for empty string [Andrii]
> >
> > v2 changes:
> >   - based on latest fprobe changes [1]
> >   - renaming the uapi interface to kprobe multi
> >   - adding support for sort_r to pass user pointer for swap functions
> >     and using that in cookie support to keep just single functions array
> >   - moving new link to kernel/trace/bpf_trace.c file
> >   - using single fprobe callback function for entry and exit
> >   - using kvzalloc, libbpf_ensure_mem functions
> >   - adding new k[ret]probe.multi sections instead of using current kprobe
> >   - used glob_match from test_progs.c, added '?' matching
> >   - move bpf_get_func_ip verifier inline change to separate change
> >   - a couple of other minor fixes
> >
> >
> > Also available at:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> >   bpf/kprobe_multi
> >
> > thanks,
> > jirka
> >
> >
> > [1] https://lore.kernel.org/bpf/164458044634.586276.3261555265565111183.stgit@devnote2/
> > [2] https://lore.kernel.org/bpf/164735281449.1084943.12438881786173547153.stgit@devnote2/
> > ---
> > Jiri Olsa (13):
> >   lib/sort: Add priv pointer to swap function
> >   kallsyms: Skip the name search for empty string
> >   bpf: Add multi kprobe link
> >   bpf: Add bpf_get_func_ip kprobe helper for multi kprobe link
> >   bpf: Add support to inline bpf_get_func_ip helper on x86
> >   bpf: Add cookie support to programs attached with kprobe multi link
> >   libbpf: Add libbpf_kallsyms_parse function
> >   libbpf: Add bpf_link_create support for multi kprobes
> >   libbpf: Add bpf_program__attach_kprobe_multi_opts function
> >   selftests/bpf: Add kprobe_multi attach test
> >   selftests/bpf: Add kprobe_multi bpf_cookie test
> >   selftests/bpf: Add attach test for bpf_program__attach_kprobe_multi_opts
> >   selftests/bpf: Add cookie test for bpf_program__attach_kprobe_multi_opts
> >
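For anyone who wants to try the new interface, attaching one program to many
functions through libbpf looks roughly like the sketch below. This is a
minimal illustration based on the patch descriptions above, not code from the
series itself: the skel skeleton handle and the "bpf_*" glob are placeholders,
and the exact option fields may still change.

  /* BPF side: a single program that gets attached to every function
   * matching the pattern supplied at attach time.  The kprobe.multi
   * section type is the one introduced by this series. */
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("kprobe.multi")
  int handle_entry(struct pt_regs *ctx)
  {
          bpf_printk("entry ip %llx", bpf_get_func_ip(ctx));
          return 0;
  }

  /* user space side: select functions with a glob pattern
   * (syms/addrs arrays can be passed through the opts instead) */
  LIBBPF_OPTS(bpf_kprobe_multi_opts, opts,
          .retprobe = false,      /* entry probes; true for return probes */
  );
  struct bpf_link *link;

  link = bpf_program__attach_kprobe_multi_opts(skel->progs.handle_entry,
                                               "bpf_*", &opts);
  if (!link)
          /* attach failed, inspect errno */
          goto cleanup;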
> Ok, so I've integrated multi-attach kprobes into retsnoop. It was
> pretty straightforward. Here are some numbers for the speed of
> attaching and, even more importantly, detaching for a set of almost
> 400 functions. It's somewhat fewer functions for the fentry-based mode
> due to more limited BTF information for static functions.
> Note that retsnoop attaches two BPF programs for each kernel function,
> so it's actually two multi-attach kprobes, each attaching to 397
> functions. For single-attach kprobe, we perform 397 * 2 = 794
> attachments.
>
> I've been invoking retsnoop with the following specified set of
> functions: -e '*sys_bpf*' -a ':kernel/bpf/syscall.c' -a
> ':kernel/bpf/verifier.c' -a ':kernel/bpf/btf.c' -a
> ':kernel/bpf/core.c'. So basically bpf syscall entry functions and all
> the discoverable functions from syscall.c, verifier.c, core.c and
> btf.c from the kernel/bpf subdirectory.
>
> Results:
>
>   fentry attach/detach (263 kfuncs):       2133 ms / 31671 ms (33 seconds)
>   kprobe attach/detach (397 kfuncs):       3121 ms / 13195 ms (16 seconds)
>   multi-kprobe attach/detach (397 kfuncs):    9 ms /   248 ms (0.25 seconds)
>
> So as you can see, the speed-up is tremendous! The API is also very
> convenient; I didn't have to modify retsnoop internals much to
> accommodate the multi-attach API. Great job!

nice! thanks for doing that so quickly

> Now for the bad news. :(
>
> Stack traces from multi-attach kretprobe are broken, which makes all
> this way less useful.
>
> Below, see stack traces captured with multi- and single-attach kretprobes
> for two different use cases. Single kprobe stack traces make much more
> sense. Ignore that the last function doesn't have an actual address
> associated with it (i.e. for __sys_bpf and bpf_tracing_prog_attach,
> respectively). That's just how retsnoop is doing things, I think. We
> actually were capturing stack traces from inside __sys_bpf (first
> case) and bpf_tracing_prog_attach (second case).
>
> MULTI KPROBE:
>   ffffffff81185a80 __sys_bpf+0x0
>                    (kernel/bpf/syscall.c:4622:1)
>
> SINGLE KPROBE:
>   ffffffff81e0007c entry_SYSCALL_64_after_hwframe+0x44
>                    (arch/x86/entry/entry_64.S:113:0)
>   ffffffff81cd2b15 do_syscall_64+0x35
>                    (arch/x86/entry/common.c:80:7)
>                  . do_syscall_x64
>                    (arch/x86/entry/common.c:50:12)
>   ffffffff811881aa __x64_sys_bpf+0x1a
>                    (kernel/bpf/syscall.c:4765:1)
>                    __sys_bpf
>
>
> MULTI KPROBE:
>   ffffffff811851b0 bpf_tracing_prog_attach+0x0
>                    (kernel/bpf/syscall.c:2708:1)
>
> SINGLE KPROBE:
>   ffffffff81e0007c entry_SYSCALL_64_after_hwframe+0x44
>                    (arch/x86/entry/entry_64.S:113:0)
>   ffffffff81cd2b15 do_syscall_64+0x35
>                    (arch/x86/entry/common.c:80:7)
>                  . do_syscall_x64
>                    (arch/x86/entry/common.c:50:12)
>   ffffffff811881aa __x64_sys_bpf+0x1a
>                    (kernel/bpf/syscall.c:4765:1)
>   ffffffff81185e79 __sys_bpf+0x3f9
>                    (kernel/bpf/syscall.c:4705:9)
>   ffffffff8118583a bpf_raw_tracepoint_open+0x19a
>                    (kernel/bpf/syscall.c:3069:6)
>                    bpf_tracing_prog_attach
>
> You can see that with the multi-attach kprobe we only get one entry, which
> is the very last function in the stack trace. We have no parent
> function captured whatsoever. Jiri, Masami, any ideas what's wrong and
> how to fix this? Let's try to figure this out and fix it before the
> feature makes it into the kernel release. Thanks in advance!
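For context, stack traces like the ones above come from the BPF program itself
asking for the kernel stack at the probe point, along these lines (a minimal
sketch, not retsnoop's actual code; the glob in the section name is just an
example):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char LICENSE[] SEC("license") = "GPL";

  #define MAX_STACK_DEPTH 32

  /* Grab the kernel stack when a traced function returns.  bpf_get_stack()
   * unwinds starting from the register state in ctx, so whatever ip/sp the
   * fprobe layer hands to the program determines what the trace looks like. */
  SEC("kretprobe.multi/bpf_*")
  int grab_stack(struct pt_regs *ctx)
  {
          __u64 stack[MAX_STACK_DEPTH];
          long len;

          len = bpf_get_stack(ctx, stack, sizeof(stack), 0);
          if (len > 0)
                  bpf_printk("captured %ld bytes of stack", len);
          return 0;
  }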
oops, I should have tried kstack with bpftrace's kretprobe, I see the same:

  # ./src/bpftrace -e 'kretprobe:x* { @[kstack] = count(); }'
  Attaching 1 probe...
  Attaching 3340 probes
  ^C

  @[
      xfs_trans_apply_dquot_deltas+0
  ]: 22
  @[
      xlog_cil_commit+0
  ]: 22
  @[
      xlog_grant_push_threshold+0
  ]: 22
  @[
      xfs_trans_add_item+0
  ]: 22
  @[
      xfs_log_reserve+0
  ]: 22
  @[
      xlog_space_left+0
  ]: 22
  @[
      xfs_buf_offset+0
  ]: 22
  @[
      xfs_trans_free_dqinfo+0
  ]: 22
  @[
      xlog_ticket_alloc+0
      xfs_log_reserve+5
  ]: 22
  @[
      xfs_cil_prepare_item+0

I think it's because we set the original ip for the return probe so that
bpf_get_func_ip works properly, but of course that breaks the backtrace.

I'm not sure we could bring along the original regs for the return probe,
but I have an idea how to work around the bpf_get_func_ip issue and keep
the registers intact for the other helpers.

jirka
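A rough sketch of what that workaround could look like, purely for
illustration: stash the entry ip in a per-invocation run context instead of
rewriting regs->ip, and let bpf_get_func_ip() read it from there.  All names
below are hypothetical, locking and recursion protection are omitted, and this
is not the actual fix from the series.

  struct kprobe_multi_run_ctx {
          struct bpf_run_ctx run_ctx;
          unsigned long entry_ip;
  };

  static int kprobe_multi_link_prog_run(struct bpf_prog *prog,
                                        unsigned long entry_ip,
                                        struct pt_regs *regs)
  {
          struct kprobe_multi_run_ctx run_ctx = {
                  .entry_ip = entry_ip,
          };
          struct bpf_run_ctx *old_run_ctx;
          int err;

          /* regs stay untouched, so bpf_get_stack()/kstack still see the
           * real return-site ip and can unwind the callers */
          old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
          err = bpf_prog_run(prog, regs);
          bpf_reset_run_ctx(old_run_ctx);
          return err;
  }

  /* bpf_get_func_ip for this link type then reads the stashed value */
  BPF_CALL_1(bpf_get_func_ip_kprobe_multi, struct pt_regs *, regs)
  {
          struct kprobe_multi_run_ctx *ctx;

          ctx = container_of(current->bpf_ctx,
                             struct kprobe_multi_run_ctx, run_ctx);
          return ctx->entry_ip;
  }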