On Tue, Aug 9, 2022 at 3:35 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote: > > On Tue, 2022-08-09 at 15:12 -0700, Alexei Starovoitov wrote: > > On Tue, Aug 9, 2022 at 12:54 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote: > > > > > > Allow creating an iterator that loops through resources of one > > > task/thread. > > > > > > People could only create iterators to loop through all resources of > > > files, vma, and tasks in the system, even though they were > > > interested > > > in only the resources of a specific task or process. Passing the > > > additional parameters, people can now create an iterator to go > > > through all resources or only the resources of a task. > > > > > > Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxx> > > > --- > > > include/linux/bpf.h | 8 ++ > > > include/uapi/linux/bpf.h | 36 +++++++++ > > > kernel/bpf/task_iter.c | 134 +++++++++++++++++++++++++++-- > > > ---- > > > tools/include/uapi/linux/bpf.h | 36 +++++++++ > > > 4 files changed, 190 insertions(+), 24 deletions(-) > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > index 11950029284f..bef81324e5f1 100644 > > > --- a/include/linux/bpf.h > > > +++ b/include/linux/bpf.h > > > @@ -1718,6 +1718,14 @@ int bpf_obj_get_user(const char __user > > > *pathname, int flags); > > > > > > struct bpf_iter_aux_info { > > > struct bpf_map *map; > > > + struct { > > > + enum bpf_iter_task_type type; > > > + union { > > > + u32 tid; > > > + u32 tgid; > > > + u32 pid_fd; > > > + }; > > > + } task; > > > }; > > > > > > typedef int (*bpf_iter_attach_target_t)(struct bpf_prog *prog, > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > index ffcbf79a556b..3d0b9e34089f 100644 > > > --- a/include/uapi/linux/bpf.h > > > +++ b/include/uapi/linux/bpf.h > > > @@ -87,10 +87,46 @@ struct bpf_cgroup_storage_key { > > > __u32 attach_type; /* program attach type > > > (enum bpf_attach_type) */ > > > }; > > > > > > +/* > > > + * The task type of iterators. > > > + * > > > + * For BPF task iterators, they can be parameterized with various > > > + * parameters to visit only some of tasks. > > > + * > > > + * BPF_TASK_ITER_ALL (default) > > > + * Iterate over resources of every task. > > > + * > > > + * BPF_TASK_ITER_TID > > > + * Iterate over resources of a task/tid. > > > + * > > > + * BPF_TASK_ITER_TGID > > > + * Iterate over reosurces of evevry task of a process / task > > > group. > > > + * > > > + * BPF_TASK_ITER_PIDFD > > > + * Iterate over resources of every task of a process /task > > > group specified by a pidfd. > > > + */ > > > +enum bpf_iter_task_type { > > > + BPF_TASK_ITER_ALL = 0, > > > + BPF_TASK_ITER_TID, > > > + BPF_TASK_ITER_TGID, > > > + BPF_TASK_ITER_PIDFD, > > > +}; > > > + > > > union bpf_iter_link_info { > > > struct { > > > __u32 map_fd; > > > } map; > > > + /* > > > + * Parameters of task iterators. > > > + */ > > > + struct { > > > + enum bpf_iter_task_type type; > > > + union { > > > + __u32 tid; > > > + __u32 tgid; > > > + __u32 pid_fd; > > > + }; > > > > Sorry I'm late to this discussion, but > > with enum and with union we kinda tell > > the kernel the same information twice. > > Here is how the selftest looks: > > + linfo.task.tid = getpid(); > > + linfo.task.type = BPF_TASK_ITER_TID; > > > > first line -> use tid. > > second line -> yeah. I really meant the tid. > > > > Instead of union and type can we do: > > > + __u32 tid; > > > + __u32 tgid; > > > + __u32 pid_fd; > > > > as 3 separate fields? > > The kernel would have to check that only one > > of them is set. > > > > I could have missed an earlier discussion on this subj. > > We may have other parameter types later, for example, cgroups. > Unfortunately, we don't have tagged enum or tagged union, like what > Rust or Haskell has, in C. A separated 'type' field would be easier > and clear to distinguish them. With 3 separated fields, people may > wonder if they can be set them all, or just one of them, in my opinion. > With an union, people should know only one of them should be set. What stops us adding new fields to the end in such a case? Some combinations will not be meaningful and the kernel would have to check and error regardless. Imagine extending union: struct { enum bpf_iter_task_type type; union { struct { __u32 tid; __u64 something_else; }; __u32 tgid; __u32 pid_fd; }; }; and now we're suddenly hitting the same issue we discussed with struct bpf_link_info in the other thread due to alignment increasing from 4 to 8 bytes. We might even need bpf_iter_link_info _v2. If 'something_else' is u32 the kernel still needs to check that it's zero in the tgid and pid_fd cases. If we're extending fields we can add a comment: struct { __u32 tid; __u32 tgid; __u32 pid_fd; __u32 something_else; /* useful in combination with tid */ }; and it's obvious what is used with what. It still feels that 3 different fields are easier to use.