Re: [RFC bpf-next 0/3] libbpf: name-based u[ret]probe attach

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 24 Jan 2022 14:47:21 -0800

On Mon, Jan 24, 2022 at 6:14 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On Fri, 21 Jan 2022, Andrii Nakryiko wrote:
>
> > On Thu, Jan 20, 2022 at 5:44 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jan 20, 2022 at 3:43 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> > > >
> > > > This patch series is a refinement of the RFC patchset [1], focusing
> > > > on support for attach by name for uprobes and uretprobes.  Still
> > > > marked RFC as there are unresolved questions.
> > > >
> > > > Currently attach for such probes is done by determining the offset
> > > > manually, so the aim is to try and mimic the simplicity of kprobe
> > > > attach, making use of uprobe opts to specify a name string.
> > > >
> > > > uprobe attach is done by specifying a binary path, a pid (where
> > > > 0 means "this process" and -1 means "all processes") and an
> > > > offset.  Here a 'func_name' option is added to 'struct uprobe_opts'
> > > > and that name is searched for in symbol tables.  If the binary
> > > > is a program, relative offset calcuation must be done to the
> > > > symbol address as described in [2].
> > >
> > > I think the pid discussion here and in the patches only causes
> > > confusion. I think it's best to remove pid from the api.
> >
> > It's already part of the uprobe API in libbpf
> > (bpf_program__attach_uprobe), but nothing really changes there.
> > API-wise Alan just added an optional func_name option. I think it
> > makes sense overall.
> >
> > For auto-attach it has to be all PIDs, of course.
> >
>
> Makes sense.
>
> > > uprobes are attached to an inode. They're not attached to a pid
> > > or a process. Any existing process or future process started
> > > from that inode (executable file) will have that uprobe triggering.
> > > The kernel can do pid filtering through predicate mechanism,
> > > but bpf uprobe doesn't do any filtering. iirc.
>
> I _think_ there is filtering in uprobe_perf_func() - it calls
> uprobe_perf_filter() prior to calling __uprobe_perf_func(), and
> the latter runs the BPF program. Maybe I'm missing something here
> though? However I think we need to give the user ways to minimize
> the cost of breakpoint placement where possible. See below...
>
> > > Similarly "attach to all processes" doesn't sound right either.
> > > It's attached to all current and future processes.
> >
>
> True, will fix for the next version.
>
> I think for users it'd be good to clarify what the overheads are. If I
> want to see malloc()s in a particular process, say I specify the libc
> path along with the process ID I'm interested in.  This adds the
> breakpoint to libc and will - as far as I understand it - trigger
> breakpoints system-wide which are then filtered out by uprobe perf handling
> for the specific process specified.  That's pretty expensive
> performance-wise, so we should probably try and give users options to
> limit the cost in cases where they don't want to incur system-wide
> overheads. I've been investigating adding support for instrumenting shared
> library calls _within_ programs by placing the breakpoint on the procedure
> linking table call associated with the function.  As this is local to the

You mean to patch PLT stubs ([0])? One concern with that is (besides
making sure that pt_regs still have exactly the same semantics and
stuff) that uprobes are much faster when patching nop instructions (if
the library was compiled with nop "preambles". Do you know if @plt
entries can be compiled with nops as well? The difference in
performance is more than 2x from my non-scientific testing recently.
So it can be a pretty big difference.

  [0] https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html

> program, it will only have an effect on malloc()s in that specific
> program.  So the next version will have a solution which allows us to
> trace malloc() in /usr/bin/foo as well as in libc. Thanks!
>
> Alan