On Tue, Oct 31, 2023 at 7:41 PM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:
>
> Hello,
>
> On 2023/11/1 06:06, Alexei Starovoitov wrote:
> > On Tue, Oct 31, 2023 at 4:38 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:
> >>
> >>
> >> So, maybe another possible solution is:
> >>
> >> diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
> >> index 209e5135f9fb..72a6778e3fba 100644
> >> --- a/kernel/bpf/cgroup_iter.c
> >> +++ b/kernel/bpf/cgroup_iter.c
> >> @@ -282,7 +282,7 @@ static struct bpf_iter_reg bpf_cgroup_reg_info = {
> >>         .ctx_arg_info_size      = 1,
> >>         .ctx_arg_info           = {
> >>                 { offsetof(struct bpf_iter__cgroup, cgroup),
> >> -                 PTR_TO_BTF_ID_OR_NULL },
> >> +                 PTR_TO_BTF_ID_OR_NULL | MEM_RCU },
> >>         },
> >>         .seq_info               = &cgroup_iter_seq_info,
> >>  };
> >> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> >> index 59e747938bdb..4fd3f734dffd 100644
> >> --- a/kernel/bpf/task_iter.c
> >> +++ b/kernel/bpf/task_iter.c
> >> @@ -706,7 +706,7 @@ static struct bpf_iter_reg task_reg_info = {
> >>         .ctx_arg_info_size      = 1,
> >>         .ctx_arg_info           = {
> >>                 { offsetof(struct bpf_iter__task, task),
> >> -                 PTR_TO_BTF_ID_OR_NULL },
> >> +                 PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED },
> >
> > Yep. That looks good.
> > bpf_cgroup_reg_info -> cgroup is probably PTR_TRUSTED too.
> > Not sure... why did you go with MEM_RCU there ?
>
> hmm...
>
> That is because in our previous discussion, you suggested we'd better
> add BTF_TYPE_SAFE_RCU_OR_NULL(struct bpf_iter__cgroup) {...}

I mentioned that because we don't have BTF_TYPE_SAFE_TRUSTED_OR_NULL
and the cgroup pointer can be NULL, but since you found a cleaner way
we can do PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED.

> I didn't think too much about it. But I noticed that we only use
> cgroup_mutex to protect the iteration in cgroup_iter.c.
>
> Looking at cgroup_kn_lock_live() in kernel/cgroup/cgroup.c, it would use
> cgroup_tryget()/cgroup_is_dead() to check whether the cgrp is 'dead'.
> cgroup_tryget() seems to be equivalent to bpf_cgroup_acquire(). So, maybe
> let's return an 'rcu' cgrp pointer. If a BPF prog wants a stronger
> guarantee for the cgrp, it can just use bpf_cgroup_acquire().

and that would be misleading.
MEM_RCU means that the pointer is valid, but it could have refcnt == 0,
while PTR_TRUSTED means that it's good to use as-is.
Here the cgroup pointer is trusted. It's not a dead cgroup.
See __cgroup_iter_seq_show() in kernel/bpf/cgroup_iter.c.
The bpf prog doesn't need to call bpf_cgroup_acquire().
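
To illustrate the distinction above, here is a rough user-space model,
not kernel code. Everything in it (struct model_cgroup, model_tryget(),
model_put(), use_rcu_pointer(), use_trusted_pointer()) is a hypothetical
stand-in made up for this sketch. It only assumes what is said above: an
RCU-protected pointer may refer to an object whose refcnt is already 0, so
a tryget-style acquire can fail, while a trusted pointer can be used as-is
after a NULL check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a cgroup. */
struct model_cgroup {
	int refcnt;	/* 0 models a dying object whose memory is still readable */
};

/* Models a tryget-style helper: take a reference only if refcnt > 0. */
static bool model_tryget(struct model_cgroup *cg)
{
	if (cg == NULL || cg->refcnt == 0)
		return false;
	cg->refcnt++;
	return true;
}

/* Drops a reference taken by model_tryget(). */
static void model_put(struct model_cgroup *cg)
{
	assert(cg->refcnt > 0);
	cg->refcnt--;
}

/*
 * MEM_RCU-like case: the memory is readable, but the object may already
 * be dying, so the caller has to acquire a reference first and must
 * handle failure.
 */
static bool use_rcu_pointer(struct model_cgroup *cg)
{
	if (!model_tryget(cg))
		return false;
	/* ... object is guaranteed alive here ... */
	model_put(cg);
	return true;
}

/*
 * PTR_TRUSTED-like case (with OR_NULL): whoever handed out the pointer
 * guarantees a live object, so only the NULL check remains.
 */
static bool use_trusted_pointer(const struct model_cgroup *cg)
{
	if (cg == NULL)
		return false;
	assert(cg->refcnt > 0);	/* invariant provided by the caller */
	return true;
}
```

In this model, use_rcu_pointer() fails on an object with refcnt == 0,
which is the extra case a prog would have to handle if the iterator
argument were marked MEM_RCU. With a trusted pointer that case cannot
occur, so the acquire step is unnecessary.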