On Wed, Jan 24, 2024 at 2:26 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 23, 2024 at 11:27:14PM +0800, Yafang Shao wrote:
> > Add three new kfuncs for bpf_iter_cpumask.
> > - bpf_iter_cpumask_new
> >   KF_RCU is defined because the cpumask must be a RCU trusted pointer
> >   such as task->cpus_ptr.
> > - bpf_iter_cpumask_next
> > - bpf_iter_cpumask_destroy
> >
> > These new kfuncs facilitate the iteration of percpu data, such as
> > runqueues, psi_cgroup_cpu, and more.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
>
> Thanks for working on this, this will be nice to have!
>
> > ---
> >  kernel/bpf/cpumask.c | 82 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 82 insertions(+)
> >
> > diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
> > index 2e73533a3811..474072a235d6 100644
> > --- a/kernel/bpf/cpumask.c
> > +++ b/kernel/bpf/cpumask.c
> > @@ -422,6 +422,85 @@ __bpf_kfunc u32 bpf_cpumask_weight(const struct cpumask *cpumask)
> >  	return cpumask_weight(cpumask);
> >  }
> >
> > +struct bpf_iter_cpumask {
> > +	__u64 __opaque[2];
> > +} __aligned(8);
> > +
> > +struct bpf_iter_cpumask_kern {
> > +	struct cpumask *mask;
> > +	int cpu;
> > +} __aligned(8);
>
> Why do we need both of these if we're not going to put the opaque
> iterator in UAPI?

Good point! Will remove it.
It aligns with the pattern seen in
bpf_iter_{css,task,task_vma,task_css}_kern, suggesting that we should
indeed eliminate them.

>
> > +
> > +/**
> > + * bpf_iter_cpumask_new() - Create a new bpf_iter_cpumask for a specified cpumask
> > + * @it: The new bpf_iter_cpumask to be created.
> > + * @mask: The cpumask to be iterated over.
> > + *
> > + * This function initializes a new bpf_iter_cpumask structure for iterating over
> > + * the specified CPU mask. It assigns the provided cpumask to the newly created
> > + * bpf_iter_cpumask @it for subsequent iteration operations.
> > + *
> > + * On success, 0 is returned. On failure, ERR is returned.
> > + */
> > +__bpf_kfunc int bpf_iter_cpumask_new(struct bpf_iter_cpumask *it, const struct cpumask *mask)
> > +{
> > +	struct bpf_iter_cpumask_kern *kit = (void *)it;
> > +
> > +	BUILD_BUG_ON(sizeof(struct bpf_iter_cpumask_kern) > sizeof(struct bpf_iter_cpumask));
> > +	BUILD_BUG_ON(__alignof__(struct bpf_iter_cpumask_kern) !=
> > +		     __alignof__(struct bpf_iter_cpumask));
>
> Why are we checking > in the first expression instead of just plain
> equality?

Similar to the previous case, it aligns with the others. Once we
eliminate the struct bpf_iter_cpumask_kern, we can safely discard these
BUILD_BUG_ON() statements as well.

>
> > +
> > +	kit->mask = bpf_mem_alloc(&bpf_global_ma, sizeof(struct cpumask));
>
> Probably better to use cpumask_size() here.

Will use it.

>
> > +	if (!kit->mask)
> > +		return -ENOMEM;
> > +
> > +	cpumask_copy(kit->mask, mask);
> > +	kit->cpu = -1;
> > +	return 0;
> > +}
> > +
> > +/**
> > + * bpf_iter_cpumask_next() - Get the next CPU in a bpf_iter_cpumask
> > + * @it: The bpf_iter_cpumask
> > + *
> > + * This function retrieves a pointer to the number of the next CPU within the
> > + * specified bpf_iter_cpumask. It allows sequential access to CPUs within the
> > + * cpumask. If there are no further CPUs available, it returns NULL.
> > + *
> > + * Returns a pointer to the number of the next CPU in the cpumask or NULL if no
> > + * further CPUs.
> > + */
> > +__bpf_kfunc int *bpf_iter_cpumask_next(struct bpf_iter_cpumask *it)
> > +{
> > +	struct bpf_iter_cpumask_kern *kit = (void *)it;
> > +	const struct cpumask *mask = kit->mask;
> > +	int cpu;
> > +
> > +	if (!mask)
> > +		return NULL;
> > +	cpu = cpumask_next(kit->cpu, mask);
> > +	if (cpu >= nr_cpu_ids)
> > +		return NULL;
> > +
> > +	kit->cpu = cpu;
> > +	return &kit->cpu;
> > +}
> > +
> > +/**
> > + * bpf_iter_cpumask_destroy() - Destroy a bpf_iter_cpumask
> > + * @it: The bpf_iter_cpumask to be destroyed.
> > + *
> > + * Destroy the resource associated with the bpf_iter_cpumask.
> > + */
> > +__bpf_kfunc void bpf_iter_cpumask_destroy(struct bpf_iter_cpumask *it)
> > +{
> > +	struct bpf_iter_cpumask_kern *kit = (void *)it;
> > +
> > +	if (!kit->mask)
> > +		return;
> > +	bpf_mem_free(&bpf_global_ma, kit->mask);
> > +}
> > +
> >  __bpf_kfunc_end_defs();
> >
> >  BTF_SET8_START(cpumask_kfunc_btf_ids)
> > @@ -450,6 +529,9 @@ BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_RCU)
> >  BTF_ID_FLAGS(func, bpf_cpumask_any_distribute, KF_RCU)
> >  BTF_ID_FLAGS(func, bpf_cpumask_any_and_distribute, KF_RCU)
> >  BTF_ID_FLAGS(func, bpf_cpumask_weight, KF_RCU)
> > +BTF_ID_FLAGS(func, bpf_iter_cpumask_new, KF_ITER_NEW | KF_RCU)
> > +BTF_ID_FLAGS(func, bpf_iter_cpumask_next, KF_ITER_NEXT | KF_RET_NULL)
> > +BTF_ID_FLAGS(func, bpf_iter_cpumask_destroy, KF_ITER_DESTROY)
> >  BTF_SET8_END(cpumask_kfunc_btf_ids)
> >
> >  static const struct btf_kfunc_id_set cpumask_kfunc_set = {
> > --
> > 2.39.1
> >
> >

--
Regards
Yafang
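
For anyone following the thread, the new/next/destroy contract in the quoted patch can be modelled in plain userspace C. This is only a rough sketch, not kernel code: every demo_* name and DEMO_NR_CPUS is hypothetical, malloc()/free() stand in for bpf_mem_alloc()/bpf_mem_free(), and a fixed-size bitmap stands in for struct cpumask. It mirrors the three properties the patch relies on: the iterator owns a private copy of the mask, the cursor starts at -1, and next() returns NULL once no further CPU is set.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define DEMO_NR_CPUS 128
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS \
	((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_cpumask {
	unsigned long bits[DEMO_MASK_LONGS];
};

struct demo_iter {
	struct demo_cpumask *mask; /* private copy owned by the iterator */
	int cpu;                   /* last CPU returned, -1 before the first call */
};

static void demo_cpumask_set(struct demo_cpumask *m, int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	m->bits[cpu / DEMO_BITS_PER_LONG] |= 1UL << (cpu % DEMO_BITS_PER_LONG);
}

static int demo_cpumask_test(const struct demo_cpumask *m, int cpu)
{
	return (m->bits[cpu / DEMO_BITS_PER_LONG] >>
		(cpu % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Next set bit after prev, or DEMO_NR_CPUS if there is none. */
static int demo_cpumask_next(int prev, const struct demo_cpumask *m)
{
	for (int cpu = prev + 1; cpu < DEMO_NR_CPUS; cpu++)
		if (demo_cpumask_test(m, cpu))
			return cpu;
	return DEMO_NR_CPUS;
}

/* Copy the caller's mask so later changes to it cannot affect the walk. */
static int demo_iter_new(struct demo_iter *it, const struct demo_cpumask *mask)
{
	it->mask = malloc(sizeof(*it->mask));
	if (!it->mask)
		return -1; /* the patch returns -ENOMEM here */
	memcpy(it->mask, mask, sizeof(*mask));
	it->cpu = -1;
	return 0;
}

/* Pointer to the next CPU number, or NULL when the walk is finished. */
static int *demo_iter_next(struct demo_iter *it)
{
	int cpu;

	if (!it->mask)
		return NULL;
	cpu = demo_cpumask_next(it->cpu, it->mask);
	if (cpu >= DEMO_NR_CPUS)
		return NULL;
	it->cpu = cpu;
	return &it->cpu;
}

/* Safe to call even if demo_iter_new() failed and left mask NULL. */
static void demo_iter_destroy(struct demo_iter *it)
{
	free(it->mask);
	it->mask = NULL;
}

/* Walk the mask, store visited CPUs in out[], return how many were stored. */
static int demo_collect(const struct demo_cpumask *mask, int *out, int max)
{
	struct demo_iter it;
	int n = 0, *cpu;

	if (demo_iter_new(&it, mask))
		return -1;
	while (n < max && (cpu = demo_iter_next(&it)))
		out[n++] = *cpu;
	demo_iter_destroy(&it);
	return n;
}
```

To try it, set a few bits with demo_cpumask_set() on a zero-initialized struct demo_cpumask and pass it to demo_collect(). Because the iterator takes a copy in demo_iter_new(), the caller is free to modify or release the original mask while the walk is in progress; that is the same trade-off the patch makes by allocating and copying in bpf_iter_cpumask_new().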