On Mon, Sep 19, 2022 at 5:01 PM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which
> will allow user-space applications to publish messages to a ring buffer
> that is consumed by a BPF program in kernel-space. In order for this
> map-type to be useful, it will require a BPF helper function that BPF
> programs can invoke to drain samples from the ring buffer, and invoke
> callbacks on those samples. This change adds that capability via a new BPF
> helper function:
>
> bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx,
>                        u64 flags)
>
> BPF programs may invoke this function to run callback_fn() on a series of
> samples in the ring buffer. callback_fn() has the following signature:
>
> long callback_fn(struct bpf_dynptr *dynptr, void *context);
>
> Samples are provided to the callback in the form of struct bpf_dynptr *'s,
> which the program can read using BPF helper functions for querying
> struct bpf_dynptr's.
>
> In order to support bpf_ringbuf_drain(), a new PTR_TO_DYNPTR register
> type is added to the verifier to reflect a dynptr that was allocated by
> a helper function and passed to a BPF program. Unlike PTR_TO_STACK
> dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR
> dynptrs need not use reference tracking, as the BPF helper is trusted to
> properly free the dynptr before returning. The verifier currently only
> supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL.
>
> Note that while the corresponding user-space libbpf logic will be added
> in a subsequent patch, this patch does contain an implementation of the
> .map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This
> .map_poll() callback guarantees that an epoll-waiting user-space
> producer will receive at least one event notification whenever at least
> one sample is drained in an invocation of bpf_user_ringbuf_drain(),
> provided that the function is not invoked with the BPF_RB_NO_WAKEUP
> flag. If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup
> notification is sent even if no sample was drained.
>
> Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> ---
>  include/linux/bpf.h            |  11 +-
>  include/uapi/linux/bpf.h       |  38 +++++++
>  kernel/bpf/helpers.c           |   2 +
>  kernel/bpf/ringbuf.c           | 181 ++++++++++++++++++++++++++++++++-
>  kernel/bpf/verifier.c          |  61 ++++++++++-
>  tools/include/uapi/linux/bpf.h |  38 +++++++
>  6 files changed, 320 insertions(+), 11 deletions(-)

[...]

>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -5599,6 +5636,7 @@ union bpf_attr {
>         FN(tcp_raw_check_syncookie_ipv4),       \
>         FN(tcp_raw_check_syncookie_ipv6),       \
>         FN(ktime_get_tai_ns),           \
> +       FN(user_ringbuf_drain),         \
>         /* */
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 41aeaf3862ec..66217b1857ca 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1627,6 +1627,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
>                 return &bpf_dynptr_write_proto;
>         case BPF_FUNC_dynptr_data:
>                 return &bpf_dynptr_data_proto;
> +       case BPF_FUNC_user_ringbuf_drain:
> +               return &bpf_user_ringbuf_drain_proto;

In light of [0], where we now allow dynptr only with CAP_BPF, I've moved
this lower behind CAP_BPF check while applying. Thanks!

  [0] https://patchwork.kernel.org/project/netdevbpf/patch/20220921143550.30247-1-memxor@xxxxxxxxx/

>         default:
>                 break;
>         }

[...]
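
For readers following the wakeup semantics in the quoted commit message, here is a minimal user-space sketch of the rule as described there. It is only a model, not the kernel code: drain_would_wake_producer() is a hypothetical name, and the behaviour when both flags are set is an assumption, since the commit message does not cover that case. The two flag values match include/uapi/linux/bpf.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/bpf.h. */
#define BPF_RB_NO_WAKEUP    (1ULL << 0)
#define BPF_RB_FORCE_WAKEUP (1ULL << 1)

/*
 * Hypothetical helper (not a kernel function): returns true if a single
 * bpf_user_ringbuf_drain() call that drained `samples_drained` samples
 * with `flags` would notify an epoll-waiting producer, following the
 * rule stated in the commit message.
 *
 * Assumption: the commit message does not say what happens when both
 * flags are passed. This model checks BPF_RB_FORCE_WAKEUP first, so it
 * takes precedence here.
 */
static bool drain_would_wake_producer(uint64_t flags, uint32_t samples_drained)
{
	if (flags & BPF_RB_FORCE_WAKEUP)
		return true;            /* wake even if nothing was drained */
	if (flags & BPF_RB_NO_WAKEUP)
		return false;           /* caller suppressed the notification */
	return samples_drained > 0;     /* default: wake only if work was done */
}

/* Small usage example covering the three cases from the commit message. */
static void drain_wakeup_examples(void)
{
	assert(drain_would_wake_producer(0, 1));
	assert(!drain_would_wake_producer(BPF_RB_NO_WAKEUP, 1));
	assert(drain_would_wake_producer(BPF_RB_FORCE_WAKEUP, 0));
}
```

With flags == 0, a producer blocked in epoll_wait() is only notified when the drain made progress, which is the default guarantee the .map_poll() description gives.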