Re: [PATCH 1/3] perf/bpf: Remove prologue generation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 10, 2022 at 1:31 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Thu, Feb 10, 2022 at 04:18:10PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Sun, Jan 23, 2022 at 11:19:30PM +0100, Jiri Olsa escreveu:
> > > Removing code for ebpf program prologue generation.
> > >
> > > The prologue code was used to get data for extra arguments specified
> > > in program section name, like:
> > >
> > >   SEC("lock_page=__lock_page page->flags")
> > >   int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
> > >   {
> > >          return 1;
> > >   }
> > >
> > > This code is using deprecated libbpf API and blocks its removal.
> > >
> > > This feature was not documented and broken for some time without
> > > anyone complaining, also original authors are not responding,
> > > so I'm removing it.
> >
> > So, the example below breaks, how hard would be to move the deprecated
> > APIs to perf like was done in some other cases?
> >

Just copy/pasting libbpf code won't work. But there are three parts:

1. bpf_(program|map|object)__set_priv(). There is no equivalent API,
but perf can maintain this private data by building hashmap where the
key is bpf_object/bpf_map/bpf_program pointer itself. Annoying but
very straightforward to replace.

2. For prologue generation, bpf_program__set_prep() doesn't have a
direct equivalent. But program cloning and adjustment of the code can
be achieved through bpf_program__insns()/bpf_program__insn_cnt() API
to load one "prototype" program, gets its underlying insns and clone
programs as necessary. After that, bpf_prog_load() would be used to
load those cloned programs into kernel.

3. Those *very* custom SEC() definitions will be possible for perf to
handle once [0] lands (I'll send new revision tomorrow, probably).
You'll be able to register your own "fallback" handler with
libbpf_register_prog_handler(NULL, ...).

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=611491&state=*

Sorry, it's not a straightforward copy/paste, but I hope this helps a bit.

> > - Arnaldo
> >
> > Before:
> >
> > [root@quaco perf]# cat tools/perf/examples/bpf/5sec.c
> > // SPDX-License-Identifier: GPL-2.0
> > /*
> >     Description:
> >
> >     . Disable strace like syscall tracing (--no-syscalls), or try tracing
> >       just some (-e *sleep).
> >
> >     . Attach a filter function to a kernel function, returning when it should
> >       be considered, i.e. appear on the output.
> >
> >     . Run it system wide, so that any sleep of >= 5 seconds and < than 6
> >       seconds gets caught.
> >
> >     . Ask for callgraphs using DWARF info, so that userspace can be unwound
> >
> >     . While this is running, run something like "sleep 5s".
> >
> >     . If we decide to add tv_nsec as well, then it becomes:
> >
> >       int probe(hrtimer_nanosleep, rqtp->tv_sec rqtp->tv_nsec)(void *ctx, int err, long sec, long nsec)
> >
> >       I.e. add where it comes from (rqtp->tv_nsec) and where it will be
> >       accessible in the function body (nsec)
> >
> >     # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c/call-graph=dwarf/
> >          0.000 perf_bpf_probe:func:(ffffffff9811b5f0) tv_sec=5
> >                                            hrtimer_nanosleep ([kernel.kallsyms])
> >                                            __x64_sys_nanosleep ([kernel.kallsyms])
> >                                            do_syscall_64 ([kernel.kallsyms])
> >                                            entry_SYSCALL_64 ([kernel.kallsyms])
> >                                            __GI___nanosleep (/usr/lib64/libc-2.26.so)
> >                                            rpl_nanosleep (/usr/bin/sleep)
> >                                            xnanosleep (/usr/bin/sleep)
> >                                            main (/usr/bin/sleep)
> >                                            __libc_start_main (/usr/lib64/libc-2.26.so)
> >                                            _start (/usr/bin/sleep)
> >     ^C#
> >
> >    Copyright (C) 2018 Red Hat, Inc., Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > */
> >
> > #include <bpf.h>
> >
> > #define NSEC_PER_SEC  1000000000L
> >
> > int probe(hrtimer_nanosleep, rqtp)(void *ctx, int err, long long sec)
> > {
> >       return sec / NSEC_PER_SEC == 5ULL;
> > }
>
> that sucks ;-) I'll check if we can re-implement as we discussed earlier,
> however below is workaround how to do it without the prologue support
>
> jirka
>
>
> ---
> diff --git a/tools/perf/examples/bpf/5sec.c b/tools/perf/examples/bpf/5sec.c
> index e6b6181c6dc6..734d39debdb8 100644
> --- a/tools/perf/examples/bpf/5sec.c
> +++ b/tools/perf/examples/bpf/5sec.c
> @@ -43,9 +43,17 @@
>
>  #define NSEC_PER_SEC   1000000000L
>
> -int probe(hrtimer_nanosleep, rqtp)(void *ctx, int err, long long sec)
> +struct pt_regs {
> +        long di;
> +};
> +
> +SEC("function=hrtimer_nanosleep")
> +int krava(struct pt_regs *ctx)
>  {
> -       return sec / NSEC_PER_SEC == 5ULL;
> +       unsigned long arg;
> +
> +       probe_read_kernel(&arg, sizeof(arg), __builtin_preserve_access_index(&ctx->di));
> +       return arg / NSEC_PER_SEC == 5ULL;
>  }
>
>  license(GPL);
> diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
> index b422aeef5339..b7d6d2fc8342 100644
> --- a/tools/perf/include/bpf/bpf.h
> +++ b/tools/perf/include/bpf/bpf.h
> @@ -64,6 +64,7 @@ int _version SEC("version") = LINUX_VERSION_CODE;
>
>  static int (*probe_read)(void *dst, int size, const void *unsafe_addr) = (void *)BPF_FUNC_probe_read;
>  static int (*probe_read_str)(void *dst, int size, const void *unsafe_addr) = (void *)BPF_FUNC_probe_read_str;
> +static long (*probe_read_kernel)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) BPF_FUNC_probe_read_kernel;
>
>  static int (*perf_event_output)(void *, struct bpf_map *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output;
>
> diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
> index 96c8ef60f4f8..9274a3373847 100644
> --- a/tools/perf/util/llvm-utils.c
> +++ b/tools/perf/util/llvm-utils.c
> @@ -25,7 +25,7 @@
>                 "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
>                 "-Wno-unused-value -Wno-pointer-sign "          \
>                 "-working-directory $WORKING_DIR "              \
> -               "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
> +               "-g -c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
>
>  struct llvm_param llvm_param = {
>         .clang_path = "clang",



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux