Yonghong Song <yhs@xxxxxx> writes: > On 11/18/21 3:11 AM, Toke Høiland-Jørgensen wrote: >> Joanne Koong <joannekoong@xxxxxx> writes: >> >>> This patch adds the kernel-side and API changes for a new helper >>> function, bpf_for_each: >>> >>> long bpf_for_each(u32 nr_interations, void *callback_fn, >>> void *callback_ctx, u64 flags); >>> >>> bpf_for_each invokes the "callback_fn" nr_iterations number of times >>> or until the callback_fn returns 1. >>> >>> A few things to please note: >>> ~ The "u64 flags" parameter is currently unused but is included in >>> case a future use case for it arises. >>> ~ In the kernel-side implementation of bpf_for_each (kernel/bpf/bpf_iter.c), >>> bpf_callback_t is used as the callback function cast. >>> ~ A program can have nested bpf_for_each calls but the program must >>> still adhere to the verifier constraint of its stack depth (the stack depth >>> cannot exceed MAX_BPF_STACK)) >>> ~ The next patch will include the tests and benchmark >>> >>> Signed-off-by: Joanne Koong <joannekoong@xxxxxx> >> >> Great to see this! One small nit, below, but otherwise: >> >> Acked-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> >> >> >>> --- >>> include/linux/bpf.h | 1 + >>> include/uapi/linux/bpf.h | 23 +++++++++++++++++++++++ >>> kernel/bpf/bpf_iter.c | 32 ++++++++++++++++++++++++++++++++ >>> kernel/bpf/helpers.c | 2 ++ >>> kernel/bpf/verifier.c | 28 ++++++++++++++++++++++++++++ >>> tools/include/uapi/linux/bpf.h | 23 +++++++++++++++++++++++ >>> 6 files changed, 109 insertions(+) >>> >>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h >>> index 6deebf8bf78f..d9b69a896c91 100644 >>> --- a/include/linux/bpf.h >>> +++ b/include/linux/bpf.h >>> @@ -2107,6 +2107,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto; >>> extern const struct bpf_func_proto bpf_task_storage_get_proto; >>> extern const struct bpf_func_proto bpf_task_storage_delete_proto; >>> extern const struct bpf_func_proto bpf_for_each_map_elem_proto; >>> +extern const struct bpf_func_proto bpf_for_each_proto; >>> extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; >>> extern const struct bpf_func_proto bpf_sk_setsockopt_proto; >>> extern const struct bpf_func_proto bpf_sk_getsockopt_proto; >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >>> index bd0c9f0487f6..ea5098920ed2 100644 >>> --- a/include/uapi/linux/bpf.h >>> +++ b/include/uapi/linux/bpf.h >>> @@ -4750,6 +4750,28 @@ union bpf_attr { >>> * The number of traversed map elements for success, **-EINVAL** for >>> * invalid **flags**. >>> * >>> + * long bpf_for_each(u32 nr_iterations, void *callback_fn, void *callback_ctx, u64 flags) >>> + * Description >>> + * For **nr_iterations**, call **callback_fn** function with >>> + * **callback_ctx** as the context parameter. >>> + * The **callback_fn** should be a static function and >>> + * the **callback_ctx** should be a pointer to the stack. >>> + * The **flags** is used to control certain aspects of the helper. >>> + * Currently, the **flags** must be 0. >>> + * >>> + * long (\*callback_fn)(u32 index, void \*ctx); >>> + * >>> + * where **index** is the current index in the iteration. The index >>> + * is zero-indexed. >>> + * >>> + * If **callback_fn** returns 0, the helper will continue to the next >>> + * iteration. If return value is 1, the helper will skip the rest of >>> + * the iterations and return. Other return values are not used now. >> >> The code will actually return for any non-zero value, though? So >> shouldn't the documentation reflect this? Or, alternatively, should the >> verifier enforce that the function can only return 0 or 1? > > This is enforced in verifier.c prepare_func_exit(). > > if (callee->in_callback_fn) { > /* enforce R0 return value range [0, 1]. */ > struct tnum range = tnum_range(0, 1); > > if (r0->type != SCALAR_VALUE) { > verbose(env, "R0 not a scalar value\n"); > return -EACCES; > } > if (!tnum_in(range, r0->var_off)) { > verbose_invalid_scalar(env, r0, &range, > "callback return", "R0"); > return -EINVAL; > } > } Ah, right! I went looking for this but couldn't find it - thanks for the pointer! -Toke