Re: [PATCH bpf-next] libbpf: Allow Golang symbols in uprobe secdef

Hengqi Chen <hengqi.chen@xxxxxxxxx> · Wed, 27 Sep 2023 10:17:42 +0800

On Tue, Sep 26, 2023 at 7:15 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Sun, Sep 24, 2023 at 8:19 PM Hengqi Chen <hengqi.chen@xxxxxxxxx> wrote:
> >
> > Golang symbols in ELF files differ from C/C++ symbols in that
> > they contain special characters like '*', '(' and ')'.
> > With generics, things get more complicated; there are
> > symbols like:
> >
> >   github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
> >    Format(fmt.State, int32); TypeName() string;
> >   github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
> >   }]).Grow
> >
> > Add " ()*,-/;[]{}" (in alphabetical order) to support matching
> > against such symbols. Note that ']' and '-' must be the first
> > and last characters in the %m range, respectively, as sscanf requires.
> >
> > A working example can be found at this repo ([0]).
> >
> >   [0]: https://github.com/chenhengqi/libbpf-go-symbols
> >
> > Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Signed-off-by: Hengqi Chen <hengqi.chen@xxxxxxxxx>
> > ---
> >  tools/lib/bpf/libbpf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index b4758e54a815..de0e068195ab 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -11630,7 +11630,7 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
> >
> >         *link = NULL;
> >
> > -       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
> > +       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",
>
> This is almost incomprehensible now... wouldn't it be clearer to just
> have a catch-all %ms at the end, and then internally checking if we
> have '+%li'? I.e., once we match everything after
> "uprobe/<path-to-binary>:", we can strchr('+'), if found, try
> sscanf("%li") on the remaining suffix. If that doesn't parse properly,
> then we have a choice -- either error out, or just assume that
> `+<something>` part is just a part of ELF symbol name?
>
> That way we don't hard-code any fixed set of symbols and avoid any
> future crazy adjustments.
>
> WDYT?

Sounds good. This also takes care of matching Unicode identifiers.

As Jiri mentioned, %ms stops at the first whitespace character,
so I am wondering whether %m[^\n] would be acceptable.
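
For illustration, the approach discussed above could look roughly like
the sketch below. This is not the actual patch: parse_uprobe_sec() and
struct uprobe_spec are hypothetical names, the helper only handles the
full "<type>/<binary>:<func>[+<offset>]" form, it splits on the last '+'
(an assumption, the suggestion only says strchr('+')), and it uses
strtol() with base 0 as a stand-in for sscanf("%li"). If the suffix does
not parse as a number, the "+<something>" part is kept as part of the
symbol name rather than treated as an error.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed pieces; not a libbpf type. */
struct uprobe_spec {
	char *probe_type;
	char *binary_path;
	char *func_name;
	long offset;
};

static void uprobe_spec_free(struct uprobe_spec *s)
{
	free(s->probe_type);
	free(s->binary_path);
	free(s->func_name);
	memset(s, 0, sizeof(*s));
}

/* Hypothetical helper: returns 0 on success, -EINVAL on a malformed name. */
static int parse_uprobe_sec(const char *sec_name, struct uprobe_spec *s)
{
	char *plus, *end;
	long off;
	int n;

	memset(s, 0, sizeof(*s));

	/* Catch-all for everything after "<type>/<binary>:", spaces included. */
	n = sscanf(sec_name, "%m[^/]/%m[^:]:%m[^\n]",
		   &s->probe_type, &s->binary_path, &s->func_name);
	if (n != 3) {
		uprobe_spec_free(s);
		return -EINVAL;
	}

	/* Look for a trailing "+<offset>"; require a non-empty name before it. */
	plus = strrchr(s->func_name, '+');
	if (plus && plus != s->func_name && plus[1] != '\0') {
		errno = 0;
		off = strtol(plus + 1, &end, 0);
		if (errno == 0 && *end == '\0') {
			*plus = '\0';	/* cut the offset off the symbol name */
			s->offset = off;
		}
		/* Otherwise treat "+..." as part of the ELF symbol name. */
	}
	return 0;
}
```

With this, "uprobe//proc/self/exe:main.(*T).Grow+0x10" would split into
the symbol "main.(*T).Grow" and offset 16, while a name such as "a+b"
stays intact with offset 0. The caller is expected to release the
strings with uprobe_spec_free().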

>
> >                    &probe_type, &binary_path, &func_name, &offset);
> >         switch (n) {
> >         case 1:
> > --
> > 2.34.1
> >