On 08/26, Jiri Olsa wrote: > > On Mon, Aug 26, 2024 at 12:40:18AM +0200, Oleg Nesterov wrote: > > $ ./test & > > $ bpftrace -p $! -e 'uprobe:./test:func { printf("%d\n", pid); }' > > > > I hope that the syntax of the 2nd command is correct... > > > > I _think_ that it will print 2 pids too. > > yes.. but with CLONE_VM both processes share 'mm' Yes sure, > so they are threads, Well this depends on definition ;) but the CLONE_VM child is not a sub-thread, it has another TGID. See below. > and at least uprobe_multi filters by process [1] now.. ;-) OK, if you say that this behaviour is fine I won't argue, I simply do not know. But see below. > > But "perf-record -p" works as expected. > > I wonder it's because there's the perf layer that schedules each > uprobe event only when its process (PID1/2) is scheduled in and will > receive events only from that cpu while the process is running on it Not sure I understand... The task which hits the breakpoint is always current, it is always scheduled in. The main purpose of uprobe_perf_func()->uprobe_perf_filter() is NOT that we want to avoid __uprobe_perf_func() although this makes sense. The main purpose is that we want to remove the breakpoints in current->mm when uprobe_perf_filter() returns false, that is why UPROBE_HANDLER_REMOVE. IOW, the main purpose is not penalise user-space unnecessarily. IIUC (but I am not sure), perf-record -p will work "correctly" even if we remove uprobe_perf_filter() altogether. IIRC the perf layer does its own filtering but I forgot everything. And this makes me think that perhaps BPF can't rely on uprobe_perf_filter() either, even we forget about ret-probes. > [1] 46ba0e49b642 bpf: fix multi-uprobe PID filtering logic Looks obviously wrong... get_pid_task(PIDTYPE_TGID) can return a zombie leader with ->mm == NULL while other threads and thus the whole process is still alive. And again, the changelog says "the intent for PID filtering it to filter by *process*", but clone(CLONE_VM) creates another process, not a thread. So perhaps we need - if (link->task && current->mm != link->task->mm) + if (link->task && !same_thread_group(current, link->task)) in uprobe_prog_run() to make "filter by *process*" true, but this won't fix the problem with link->task->mm == NULL in uprobe_multi_link_filter(). Does bpftrace use bpf_uprobe_multi_link_attach/etc ? I guess not... Then which userspace tool uses this code? ;) Oleg.