On Mon, Apr 27, 2020 at 1:17 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> Only the tasks belonging to "current" pid namespace
> are enumerated.
>
> For task/file target, the bpf program will have access to
>   struct task_struct *task
>   u32 fd
>   struct file *file
> where fd/file is an open file for the task.
>
> Signed-off-by: Yonghong Song <yhs@xxxxxx>
> ---
>  kernel/bpf/Makefile    |   2 +-
>  kernel/bpf/task_iter.c | 319 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 320 insertions(+), 1 deletion(-)
>  create mode 100644 kernel/bpf/task_iter.c
>

[...]

> +static void *task_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	struct bpf_iter_seq_task_info *info = seq->private;
> +	struct task_struct *task;
> +	u32 id = info->id;
> +
> +	if (*pos == 0)
> +		info->ns = task_active_pid_ns(current);

I wonder why pid namespace is set in start() callback each time, while
net_ns was set once when seq_file is created. I think it should be
consistent, no? Either pid_ns is another feature and is set
consistently just once using the context of the process that creates
seq_file, or net_ns could be set using the same method without
bpf_iter infra knowing about this feature? Or there are some
non-obvious aspects which make pid_ns easier to work with?

Either way, process read()'ing seq_file might be different than
process open()'ing seq_file, so they might have different namespaces.
We need to decide explicitly which context should be used and do it
consistently.

> +
> +	task = task_seq_get_next(info->ns, &id);
> +	if (!task)
> +		return NULL;
> +
> +	++*pos;
> +	info->task = task;
> +	info->id = id;
> +
> +	return task;
> +}
> +
> +static void *task_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> +{
> +	struct bpf_iter_seq_task_info *info = seq->private;
> +	struct task_struct *task;
> +
> +	++*pos;
> +	++info->id;

this would make iterator skip pid 0? Is that by design?

> +	task = task_seq_get_next(info->ns, &info->id);
> +	if (!task)
> +		return NULL;
> +
> +	put_task_struct(info->task);

on very first iteration info->task might be NULL, right?

> +	info->task = task;
> +	return task;
> +}
> +
> +struct bpf_iter__task {
> +	__bpf_md_ptr(struct bpf_iter_meta *, meta);
> +	__bpf_md_ptr(struct task_struct *, task);
> +};
> +
> +int __init __bpf_iter__task(struct bpf_iter_meta *meta, struct task_struct *task)
> +{
> +	return 0;
> +}
> +
> +static int task_seq_show(struct seq_file *seq, void *v)
> +{
> +	struct bpf_iter_meta meta;
> +	struct bpf_iter__task ctx;
> +	struct bpf_prog *prog;
> +	int ret = 0;
> +
> +	prog = bpf_iter_get_prog(seq, sizeof(struct bpf_iter_seq_task_info),
> +				 &meta.session_id, &meta.seq_num,
> +				 v == (void *)0);
> +	if (prog) {

can it happen that prog is NULL?

> +		meta.seq = seq;
> +		ctx.meta = &meta;
> +		ctx.task = v;
> +		ret = bpf_iter_run_prog(prog, &ctx);
> +	}
> +
> +	return ret == 0 ? 0 : -EINVAL;
> +}
> +
> +static void task_seq_stop(struct seq_file *seq, void *v)
> +{
> +	struct bpf_iter_seq_task_info *info = seq->private;
> +
> +	if (!v)
> +		task_seq_show(seq, v);

hmm... show() called from stop()? what's the case where this is
necessary?

> +
> +	if (info->task) {
> +		put_task_struct(info->task);
> +		info->task = NULL;
> +	}
> +}
> +

[...]
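
To illustrate the info->task question on task_seq_next(): if that
pointer can be NULL at that point, the put could be guarded the same
way task_seq_stop() already guards it. Below is a minimal userspace
sketch of that pattern only. The mock_* types and functions are
hypothetical stand-ins, not kernel APIs, and a plain integer stands in
for the real task refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct; only a refcount is modelled. */
struct mock_task {
	int refs;
};

/* Hypothetical stand-in for struct bpf_iter_seq_task_info. */
struct mock_iter_info {
	struct mock_task *task;
};

/* Hypothetical stand-in for put_task_struct(): drop one reference. */
static void mock_put_task(struct mock_task *t)
{
	t->refs--;
}

/*
 * Swap the task held by the iterator. The previous reference is dropped
 * only when there is one, mirroring the NULL check in task_seq_stop().
 */
static void mock_iter_set_task(struct mock_iter_info *info,
			       struct mock_task *next)
{
	assert(info != NULL);

	if (info->task)
		mock_put_task(info->task);
	info->task = next;
}
```

With the guard in place, calling mock_iter_set_task() on an info whose
task is still NULL just stores the new pointer, and later calls drop
exactly one reference on the task being replaced.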