Re: [PATCH v2] tracing/uprobe: Add missing PID filter for uretprobe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 02, 2024 at 03:22:25AM +0800, Tianyi Liu wrote:
> On Fri, Aug 30, 2024 at 18:12:41PM +0800, Oleg Nesterov wrote:
> > The whole discussion was very confusing (yes, I too contributed to the
> > confusion ;), let me try to summarise.
> > 
> > > U(ret)probes are designed to be filterable using the PID, which is the
> > > second parameter in the perf_event_open syscall. Currently, uprobe works
> > > well with the filtering, but uretprobe is not affected by it.
> > 
> > And this is correct. But the CONFIG_BPF_EVENTS code in __uprobe_perf_func()
> > misunderstands the purpose of uprobe_perf_filter().
> > 
> > Lets forget about BPF for the moment. It is not that uprobe_perf_filter()
> > does the filtering by the PID, it doesn't. We can simply kill this function
> > and perf will work correctly. The perf layer in __uprobe_perf_func() does
> > the filtering when perf_event->hw.target != NULL.
> > 
> > So why does uprobe_perf_filter() call uprobe_perf_filter()? Not to avoid
> > the __uprobe_perf_func() call (as the BPF code assumes), but to trigger
> > unapply_uprobe() in handler_chain().
> > 
> > Suppose you do, say,
> > 
> > 	$ perf probe -x /path/to/libc some_hot_function
> > or
> > 	$ perf probe -x /path/to/libc some_hot_function%return
> > then
> > 	$perf record -e ... -p 1
> > 
> > to trace the usage of some_hot_function() in the init process. Everything
> > will work just fine if we kill uprobe_perf_filter()->uprobe_perf_filter().
> > 
> > But. If INIT forks a child C, dup_mm() will copy int3 installed by perf.
> > So the child C will hit this breakpoint and cal handle_swbp/etc for no
> > reason every time it calls some_hot_function(), not good.
> > 
> > That is why uprobe_perf_func() calls uprobe_perf_filter() which returns
> > UPROBE_HANDLER_REMOVE when C hits the breakpoint. handler_chain() will
> > call unapply_uprobe() which will remove this breakpoint from C->mm.
> > 
> > > We found that the filter function was not invoked when uretprobe was
> > > initially implemented, and this has been existing for ten years.
> > 
> > See above, this is correct.
> > 
> > Note also that if you only use perf-probe + perf-record, no matter how
> > many instances, you can even add BUG_ON(!uprobe_perf_filter(...)) into
> > uretprobe_perf_func(). IIRC, perf doesn't use create_local_trace_uprobe().
> > 
> 
> Thanks for the detailed explanation above, I can understand the code now. 
> Yes, I completely misunderstood the purpose of uprobe_perf_filter, 
> sorry for the confusion.
> 
> > ---------------------------------------------------------------------------
> > Now lets return to BPF and this particular problem. I won't really argue
> > with this patch, but
> > 
> > 	- Please change the subject and update the changelog,
> > 	  "Fixes: c1ae5c75e103" and the whole reasoning is misleading
> > 	  and wrong, IMO.
> > 
> > 	- This patch won't fix all problems because uprobe_perf_filter()
> > 	  filters by mm, not by pid. The changelog/patch assumes that it
> > 	  is a "PID filter", but it is not.
> > 
> > 	  See https://lore.kernel.org/linux-trace-kernel/20240825224018.GD3906@xxxxxxxxxx/
> > 	  If the traced process does clone(CLONE_VM), bpftrace will hit the
> > 	  similar problem, with uprobe or uretprobe.
> > 
> > 	- So I still think that the "right" fix should change the
> > 	  bpf_prog_run_array_uprobe() paths somehow, but I know nothing
> > 	  about bpf.
> 
> I agree that this patch does not address the issue correctly. 
> The PID filter should be implemented within bpf_prog_run_array_uprobe, 
> or alternatively, bpf_prog_run_array_uprobe should be called after 
> perf_tp_event_match to reuse the filtering mechanism provided by perf.
> 
> Also, uretprobe may need UPROBE_HANDLER_REMOVE, similar to uprobe.
> 
> For now, please forget the original patch as we need a new solution ;)

hi,
any chance we could go with your fix until we find better solution?

it's simple and it fixes most of the cases for return uprobe pid filter
for events with bpf programs.. I know during the discussion we found
that standard perf record path won't work if there's bpf program
attached on the same event, but I think that likely it needs more
changes and also it's not a common use case

would you consider sending another version addressing Oleg's points
for changelog above?

thanks,
jirka




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux