On Wed, Aug 24, 2022 at 5:16 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote: > > On Wed, 2022-08-24 at 15:20 -0700, Andrii Nakryiko wrote: > > On Fri, Aug 19, 2022 at 3:09 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote: > > > > > > Allow creating an iterator that loops through resources of one > > > task/thread. > > > > > > People could only create iterators to loop through all resources of > > > files, vma, and tasks in the system, even though they were > > > interested > > > in only the resources of a specific task or process. Passing the > > > additional parameters, people can now create an iterator to go > > > through all resources or only the resources of a task. > > > > > > Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxx> > > > --- > > > include/linux/bpf.h | 25 +++++++ > > > include/uapi/linux/bpf.h | 6 ++ > > > kernel/bpf/task_iter.c | 116 ++++++++++++++++++++++++++--- > > > ---- > > > tools/include/uapi/linux/bpf.h | 6 ++ > > > 4 files changed, 129 insertions(+), 24 deletions(-) > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > index 39bd36359c1e..59712dd917d8 100644 > > > --- a/include/linux/bpf.h > > > +++ b/include/linux/bpf.h > > > @@ -1729,8 +1729,33 @@ int bpf_obj_get_user(const char __user > > > *pathname, int flags); > > > extern int bpf_iter_ ## target(args); \ > > > int __init bpf_iter_ ## target(args) { return 0; } > > > > > > +/* > > > + * The task type of iterators. > > > + * > > > + * For BPF task iterators, they can be parameterized with various > > > + * parameters to visit only some of tasks. > > > + * > > > + * BPF_TASK_ITER_ALL (default) > > > + * Iterate over resources of every task. > > > + * > > > + * BPF_TASK_ITER_TID > > > + * Iterate over resources of a task/tid. > > > + * > > > + * BPF_TASK_ITER_TGID > > > + * Iterate over reosurces of evevry task of a process / task > > > group. > > > > typos: resources, every > > > > > + */ > > > +enum bpf_iter_task_type { > > > + BPF_TASK_ITER_ALL = 0, > > > + BPF_TASK_ITER_TID, > > > + BPF_TASK_ITER_TGID, > > > +}; > > > + > > > > [...] > > > > > rcu_read_lock(); > > > retry: > > > - pid = find_ge_pid(*tid, ns); > > > + pid = find_ge_pid(*tid, common->ns); > > > if (pid) { > > > - *tid = pid_nr_ns(pid, ns); > > > + *tid = pid_nr_ns(pid, common->ns); > > > task = get_pid_task(pid, PIDTYPE_PID); > > > if (!task) { > > > ++*tid; > > > goto retry; > > > - } else if (skip_if_dup_files && > > > !thread_group_leader(task) && > > > - task->files == task->group_leader- > > > >files) { > > > + } else if ((skip_if_dup_files && > > > !thread_group_leader(task) && > > > + task->files == task->group_leader- > > > >files) || > > > + (common->type == BPF_TASK_ITER_TGID && > > > + __task_pid_nr_ns(task, PIDTYPE_TGID, > > > common->ns) != common->pid)) { > > > > it gets super hard to follow this logic, would a simple helper > > function to calculate this condition (and maybe some comments to > > explain the logic behind these checks?) make it a bit more readable? > > !matched_task(task, common, skip_if_dup_file)? > > bool matched_task(struct task_struct *task, > struct bpf_iter_seq_task_common *common, > bool skip_if_dup_file) { > /* Should not have the same 'files' if skip_if_dup_file is true > */ > bool diff_files_if = > !skip_if_dup_file || > (thread_group_leader(task) && > task->file != task->gorup_leader->fies); > /* Should have the given tgid if the type is BPF_TASK_ITER_TGI > */ > bool have_tgid_if = > common->type != BPF_TASK_ITER_TGID || > __task_pid_nr_ns(task, PIDTYPE_TGID, > common->ns) == common->pid; > return diff_files_if && have_tgid_if; > } > > How about this? > Hm... "matched_task" doesn't mean much, tbh, so not really. I wanted to suggest having a separate helper just for your TGID check and call it something more meaningful like "task_belongs_to_tgid". Can't come up with a good name for existing dup_file check, so I'd probably keep it as is. But also seems like there is same_thread_group() helper in include/linux/sched/signal.h, so let's look if we can use it, it seems like it's just comparing signal pointers (probably quite faster than what you are doing right now). But looking at this some more made me realize that even if we specify pid (tgid in kernel terms) we are still going to iterate through all the tasks, essentially. Is that right? So TGID mode isn't great for speeding up, we should point out to users that if they want to iterate files of the process, they probably want to use TID mode and set tid to pid to use the early termination condition in TID. It wasn't obvious to me until I re-read this patch like 3 times and wrote three different replies here :) But then I also went looking at what procfs doing for /proc/<pid/task/* dirs. It does seem like there are faster ways to iterate all threads of a process. See next_tid() which uses next_thread(), etc. Can you please check those and see if we can have faster in-process iteration? > > > > > put_task_struct(task); > > > task = NULL; > > > ++*tid; > > > @@ -56,7 +73,7 @@ static void *task_seq_start(struct seq_file *seq, > > > loff_t *pos) > > > struct bpf_iter_seq_task_info *info = seq->private; > > > struct task_struct *task; > > > > > > - task = task_seq_get_next(info->common.ns, &info->tid, > > > false); > > > + task = task_seq_get_next(&info->common, &info->tid, false); > > > if (!task) > > > return NULL; > > > [...]