Hello.
在 2023/9/15 07:26, Andrii Nakryiko 写道:
On Tue, Sep 12, 2023 at 12:02 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:
This patch adds kfuncs bpf_iter_process_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_process in open-coded iterator
style. BPF programs can use these kfuncs or through bpf_for_each macro to
iterate all processes in the system.
Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
include/uapi/linux/bpf.h | 4 ++++
kernel/bpf/helpers.c | 3 +++
kernel/bpf/task_iter.c | 29 +++++++++++++++++++++++++++++
tools/include/uapi/linux/bpf.h | 4 ++++
tools/lib/bpf/bpf_helpers.h | 5 +++++
5 files changed, 45 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index de02c0971428..befa55b52e29 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7322,4 +7322,8 @@ struct bpf_iter_css_task {
__u64 __opaque[1];
} __attribute__((aligned(8)));
+struct bpf_iter_process {
+ __u64 __opaque[1];
+} __attribute__((aligned(8)));
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index d6a16becfbb9..9b7d2c6f99d1 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2507,6 +2507,9 @@ BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW)
BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_iter_process_new, KF_ITER_NEW)
+BTF_ID_FLAGS(func, bpf_iter_process_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_process_destroy, KF_ITER_DESTROY)
BTF_ID_FLAGS(func, bpf_dynptr_adjust)
BTF_ID_FLAGS(func, bpf_dynptr_is_null)
BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index d8539cc05ffd..9d1927dc3a06 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -851,6 +851,35 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it)
kfree(kit->css_it);
}
+struct bpf_iter_process_kern {
+ struct task_struct *tsk;
+} __attribute__((aligned(8)));
+
Few high level thoughts. I think it would be good to follow
SEC("iter/task") naming and approach. Open-coded iterators in many
ways are in-kernel counterpart to iterator programs, so keeping them
close enough within reason is useful for knowledge transfer.
SEC("iter/task") allows to:
a) iterate all threads in the system
b) iterate all threads for a given TGID
c) it also allows to "iterate" a single thread or process, but that's
a bit less relevant for in-kernel iterator, but we can still support
them, why not?
I'm not sure if it supports iterating all processes (as in group
leaders of each task group) in the system, but if it's possible I
think we should support it at least for open-coded iterator, seems
like a very useful functionality.
So to that end, let's design a small set of input arguments for
bpf_iter_process_new() that would allow to specify this as flags +
either (optional) struct task_struct * pointer to represent
task/process or PID/TGID.
IIUC, we should define the following task_new kfunc
struct bpf_iter_task_kern {
struct task_struct *start;
struct task_struct *cur;
unsigned int flag;
} __attribute__((aligned(8)));
__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_css_task *it,
struct task_struct *start, unsigned int flags)
If we want to iterate all threads of a task, just pass it to *start*,
and if we want to iterating all process in the system, users may need to
pass a nullptr to the *start*. But it seems current BPF verifier will
reject the nullptr to task_struct. The error message meybe:
"Possibly NULL pointer passed to trusted argx"
I noticed the __opt annotation in kfunc document. It seems with
following we can pass the nullptr to *start*
__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_css_task *it,
void *start__opt, unsigned int flags__szk)
However, in this way, user can pass any invalid pointer to the kfunc
without verifying. Besides, it seems __opt is only allowed to use with
__szk together and flags__szk is ambiguous in semantics.
Do you have better ideas? Or I missing something ?
Thanks