Hello,
在 2023/9/6 03:02, Alexei Starovoitov 写道:
On Sun, Aug 27, 2023 at 12:21 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:
This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_css_task in open-coded
iterator style. These kfuncs actually wrapps
css_task_iter_{start,next,end}. BPF programs can use these kfuncs through
bpf_for_each macro for iteration of all tasks under a css.
Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
include/uapi/linux/bpf.h | 4 ++++
kernel/bpf/helpers.c | 3 +++
kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++
tools/include/uapi/linux/bpf.h | 4 ++++
tools/lib/bpf/bpf_helpers.h | 7 ++++++
5 files changed, 57 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 60a9d59beeab..2a6e9b99564b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7195,4 +7195,8 @@ struct bpf_iter_num {
__u64 __opaque[1];
} __attribute__((aligned(8)));
+struct bpf_iter_css_task {
+ __u64 __opaque[1];
+} __attribute__((aligned(8)));
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9e80efa59a5d..cf113ad24837 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW)
+BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY)
BTF_ID_FLAGS(func, bpf_dynptr_adjust)
BTF_ID_FLAGS(func, bpf_dynptr_is_null)
BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index c4ab9d6cdbe9..b1bdba40b684 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = {
.arg5_type = ARG_ANYTHING,
};
+struct bpf_iter_css_task_kern {
+ struct css_task_iter *css_it;
+} __attribute__((aligned(8)));
+
+__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it,
+ struct cgroup_subsys_state *css, unsigned int flags)
+{
+ struct bpf_iter_css_task_kern *kit = (void *)it;
+
+ BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task));
+ BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) !=
+ __alignof__(struct bpf_iter_css_task));
+
+ kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL);
+ if (!kit->css_it)
+ return -ENOMEM;
+ css_task_iter_start(css, flags, kit->css_it);
Some of the flags are internal. Like CSS_TASK_ITER_SKIPPED.
The kfunc should probably only allow CSS_TASK_ITER_PROCS |
CSS_TASK_ITER_THREADED,
and not CSS_TASK_ITER_THREADED alone.
Since they're #define-s it's not easy for bpf prog to use them.
I think would be good to have a pre-patch that converts them to enum,
so that bpf prog can take them from vmlinux.h.
But the main issue of the patch that it adds this iter to common kfuncs.
That's not safe, since css_task_iter_*() does spin_unlock_irq() which
might screw up irq flags depending on the context where bpf prog is running.
Can css_task_iter internals switch to irqsave/irqrestore?
Yes, I think so. Switching to irqsave/irqrestore is no harm.
css_set_lock is also global, so the bpf side has to be careful in
where it allows to use this iter.
bpf_lsm hooks are safe, most of bpf iter-s are safe too.
Future bpf-oom hooks are probably safe as well.
We probably need an allowlist here.
What should we do if we want to make a allowlist?
Do you mean we need to check prog_type or attach_type when we call these
kfuncs in BPF verifier? If so, we should add a new attach_type or
prog_type for bpf-oom in the feature so we can know the current BPF
program is hooking for OOM Policy.