On Fri, Sep 13, 2024 at 6:50 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> On Fri, 13 Sep 2024 21:45:15 +0900
> Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> wrote:
>
> > On Fri, 13 Sep 2024 17:59:35 +0900
> > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> wrote:
> >
> > > > >
> > > > > Taking failing output from the test:
> > > > >
> > > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > >
> > > > kretprobe_test3_result is a sort of identifier for a test condition;
> > > > you can find the corresponding line in the user space .c file by
> > > > grepping for it:
> > > >
> > > > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > > >           "kretprobe_test3_result");
> > > >
> > > > Most probably the problem is in:
> > > >
> > > > __u64 addr = bpf_get_func_ip(ctx);
> > > Yeah, and as I replied in another thread, the problem is
> > > that the ftrace entry_ip is not the symbol ip.
> > >
> > > We have the ftrace_call_adjust() arch function for one direction
> > > (symbol ip to ftrace entry ip), but what we need here is the
> > > reverse translation (ftrace entry ip to symbol ip).
> > >
> > > The easiest way is to use kallsyms to find it, but that is a bit
> > > costly (though it only adds a few levels to the bsearch).
> > > Other possible options are:
> > >
> > >  - Change bpf_kprobe_multi_addrs_cmp() to accept a range of
> > >    addresses: [sym_addr, sym_addr + offset) returns true, so
> > >    bpf_kprobe_multi_cookie() can find the entry address.
> > >    The offset depends on the arch, but 16 is enough.
> > Oops, no, this bpf_kprobe_multi_cookie() is used only for storing
> > test data, not for solving the general problem. (I saw
> > kprobe_multi_check().)
> >
> > So solving the problem is more costly: we need to put more
> > arch-dependent code in bpf_trace and make sure it does the reverse
> > translation of ftrace_call_adjust(). (This means that if you expand
> > arch support, you need to add such an implementation.)
>
> OK, can you try this one?
>

I'm running out of time today, so I won't have time to try this, sorry.
But see my last reply: I think adjusting link->addrs once before
attachment is the way to go. It gives us fast and simple lookups at
runtime. In my last reply I assumed that we won't need to keep a copy
of the original addrs (because we can dynamically adjust for
bpf_kprobe_multi_link_fill_link_info()), but I now realize that we do
need that for bpf_get_func_ip() anyway. Still, I'd rather have an
extra link->adj_addrs with a copy and do a quick and simple lookup at
runtime. So I suggest going with that. In the very worst case it's a
few kilobytes of memory for thousands of attached functions, no big
deal, IMO.

It's vastly better than maintaining an arch-specific reverse of
ftrace_call_adjust(), isn't it? Jiri, any opinion here?

>
> From 81bc599911507215aa9faa1077a601880cbd654a Mon Sep 17 00:00:00 2001
> From: "Masami Hiramatsu (Google)" <mhiramat@xxxxxxxxxx>
> Date: Fri, 13 Sep 2024 21:43:46 +0900
> Subject: [PATCH] bpf: Add get_entry_ip() for arm64
>
> Add get_entry_ip() implementation for arm64. This is based on
> the information in ftrace_call_adjust() for arm64.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
> ---
>  kernel/trace/bpf_trace.c | 64 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index deb629f4a510..b0cf6e5b8965 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1066,6 +1066,70 @@ static unsigned long get_entry_ip(unsigned long fentry_ip)
>  		fentry_ip -= ENDBR_INSN_SIZE;
>  	return fentry_ip;
>  }
> +#elif defined (CONFIG_ARM64)
> +#include <asm/insn.h>
> +
> +static unsigned long get_entry_ip(unsigned long fentry_ip)
> +{
> +	u32 insn;
> +
> +	/*
> +	 * When using patchable-function-entry without pre-function NOPs, the ftrace
> +	 * entry is the address of the first NOP after the function entry point.
> +	 *
> +	 * The compiler has either generated:
> +	 *
> +	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
> +	 * func+04:		NOP		// To be patched to BL <caller>
> +	 *
> +	 * Or:
> +	 *
> +	 * func-04:		BTI	C
> +	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
> +	 * func+04:		NOP		// To be patched to BL <caller>
> +	 *
> +	 * The fentry_ip is the address of `BL <caller>`, which is at `func + 4`
> +	 * bytes in either case.
> +	 */
> +	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
> +		return fentry_ip - AARCH64_INSN_SIZE;
> +
> +	/*
> +	 * When using patchable-function-entry with pre-function NOPs, BTI is
> +	 * a bit different.
> +	 *
> +	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
> +	 * func+04:		NOP		// To be patched to BL <caller>
> +	 *
> +	 * Or:
> +	 *
> +	 * func+00:	func:	BTI	C
> +	 * func+04:		NOP		// To be patched to MOV X9, LR
> +	 * func+08:		NOP		// To be patched to BL <caller>
> +	 *
> +	 * The fentry_ip is the address of `BL <caller>`, which is at either
> +	 * `func + 4` or `func + 8`, depending on whether there is a BTI.
> +	 */
> +
> +	/* If there is no BTI, the func address should be one instruction before. */
> +	if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> +		return fentry_ip - AARCH64_INSN_SIZE;
> +
> +	/* We want to be extra safe in case entry ip is on the page edge,
> +	 * but otherwise we need to avoid get_kernel_nofault()'s overhead.
> +	 */
> +	if ((fentry_ip & ~PAGE_MASK) < AARCH64_INSN_SIZE * 2) {
> +		if (get_kernel_nofault(insn, (u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2)))
> +			return fentry_ip - AARCH64_INSN_SIZE;
> +	} else {
> +		insn = *(u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2);
> +	}
> +
> +	if (aarch64_insn_is_bti(le32_to_cpu((__le32)insn)))
> +		return fentry_ip - AARCH64_INSN_SIZE * 2;
> +
> +	return fentry_ip - AARCH64_INSN_SIZE;
> +}
>  #else
>  #define get_entry_ip(fentry_ip) fentry_ip
>  #endif
> --
> 2.34.1
>
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
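
For illustration, here is a minimal, self-contained userspace sketch of the
lookup approach suggested above: pre-adjust the attachment addresses once, keep
them in a second sorted array next to the original symbol addresses, and
binary-search that array at runtime to map an fentry_ip back to its symbol
address. All names (adj_addrs, sym_addrs, lookup_sym_addr) and the addresses
are hypothetical and only model the idea; this is not the actual bpf_trace.c
code.

/*
 * Hypothetical sketch of the "adjust once, look up at runtime" idea:
 * adj_addrs[] holds ftrace entry IPs that were already adjusted at
 * attach time, sym_addrs[] holds the corresponding symbol addresses,
 * and both arrays are sorted in the same order. Mapping an fentry_ip
 * back to the symbol address is then a single binary search, with no
 * arch-specific reverse of ftrace_call_adjust() needed at runtime.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return x < y ? -1 : x > y ? 1 : 0;
}

/* Made-up addresses: entry IP = symbol address + 4 or + 8. */
static const uint64_t adj_addrs[] = { 0x1004, 0x2008, 0x3004 };
static const uint64_t sym_addrs[] = { 0x1000, 0x2000, 0x3000 };

static uint64_t lookup_sym_addr(uint64_t fentry_ip)
{
	const uint64_t *found;

	found = bsearch(&fentry_ip, adj_addrs,
			sizeof(adj_addrs) / sizeof(adj_addrs[0]),
			sizeof(adj_addrs[0]), cmp_u64);
	if (!found)
		return 0;

	/* Same index in the parallel symbol-address array. */
	return sym_addrs[found - adj_addrs];
}

int main(void)
{
	uint64_t fentry_ip = 0x2008;

	printf("fentry_ip 0x%llx -> symbol 0x%llx\n",
	       (unsigned long long)fentry_ip,
	       (unsigned long long)lookup_sym_addr(fentry_ip));
	return 0;
}

The design assumption is that both arrays stay sorted in the same order, so
the index found in adj_addrs also indexes sym_addrs; the per-link memory cost
is one extra address per attached function, as noted above.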