On Sun, Oct 2, 2022 at 9:46 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> On Thu, 29 Sept 2022 at 02:43, Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Wed, Sep 28, 2022 at 5:32 PM Kumar Kartikeya Dwivedi
> > <memxor@xxxxxxxxx> wrote:
> > >
> > > On Thu, 29 Sept 2022 at 01:04, Andrii Nakryiko
> > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Sep 7, 2022 at 5:07 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> > > > >
> > > > > Add a new helper function, bpf_dynptr_iterator:
> > > > >
> > > > >   long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> > > > >                            void *callback_ctx, u64 flags)
> > > > >
> > > > > where callback_fn is defined as:
> > > > >
> > > > >   long (*callback_fn)(struct bpf_dynptr *ptr, void *ctx)
> > > > >
> > > > > and callback_fn returns the number of bytes to advance the
> > > > > dynptr by (or an error code in the case of error). The iteration
> > > > > will stop if the callback_fn returns 0 or an error or tries to
> > > > > advance by more bytes than available.
> > > > >
> > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > > > > ---
> > > > >  include/uapi/linux/bpf.h       | 20 ++++++++++++++
> > > > >  kernel/bpf/helpers.c           | 48 +++++++++++++++++++++++++++++++---
> > > > >  kernel/bpf/verifier.c          | 27 +++++++++++++++++++
> > > > >  tools/include/uapi/linux/bpf.h | 20 ++++++++++++++
> > > > >  4 files changed, 111 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > index 16973fa4d073..ff78a94c262a 100644
> > > > > --- a/include/uapi/linux/bpf.h
> > > > > +++ b/include/uapi/linux/bpf.h
> > > > > @@ -5531,6 +5531,25 @@ union bpf_attr {
> > > > >  *             losing access to the original view of the dynptr.
> > > > >  *     Return
> > > > >  *             0 on success, -EINVAL if the dynptr to clone is invalid.
> > > > > + *
> > > > > + * long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn, void *callback_ctx, u64 flags)
> > > > > + *     Description
> > > > > + *             Iterate through the dynptr data, calling **callback_fn** on each
> > > > > + *             iteration with **callback_ctx** as the context parameter.
> > > > > + *             The **callback_fn** should be a static function and
> > > > > + *             the **callback_ctx** should be a pointer to the stack.
> > > > > + *             Currently **flags** is unused and must be 0.
> > > > > + *
> > > > > + *             long (\*callback_fn)(struct bpf_dynptr \*ptr, void \*ctx);
> > > > > + *
> > > > > + *             where **callback_fn** returns the number of bytes to advance
> > > > > + *             the dynptr by or an error. The iteration will stop if **callback_fn**
> > > > > + *             returns 0 or an error or tries to advance by more bytes than the
> > > > > + *             size of the dynptr.
> > > > > + *     Return
> > > > > + *             0 on success, -EINVAL if the dynptr is invalid or **flags** is not 0,
> > > > > + *             -ERANGE if attempting to iterate more bytes than available, or other
> > > > > + *             negative error if **callback_fn** returns an error.
> > > > >  */
> > > > >  #define __BPF_FUNC_MAPPER(FN)          \
> > > > >         FN(unspec),                     \
> > > > > @@ -5752,6 +5771,7 @@ union bpf_attr {
> > > > >         FN(dynptr_get_size),            \
> > > > >         FN(dynptr_get_offset),          \
> > > > >         FN(dynptr_clone),               \
> > > > > +       FN(dynptr_iterator),            \
> > > > >         /* */
> > > > >
> > > > >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > > > index 667f1e213a61..519b3da06d49 100644
> > > > > --- a/kernel/bpf/helpers.c
> > > > > +++ b/kernel/bpf/helpers.c
> > > > > @@ -1653,13 +1653,11 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
> > > > >         .arg3_type      = ARG_CONST_ALLOC_SIZE_OR_ZERO,
> > > > >  };
> > > > >
> > > > > -BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
> > > > > +/* *ptr* should always be a valid dynptr */
> > > > > +static int __bpf_dynptr_advance(struct bpf_dynptr_kern *ptr, u32 len)
> > > > >  {
> > > > >         u32 size;
> > > > >
> > > > > -       if (!ptr->data)
> > > > > -               return -EINVAL;
> > > > > -
> > > > >         size = __bpf_dynptr_get_size(ptr);
> > > > >
> > > > >         if (len > size)
> > > > > @@ -1672,6 +1670,14 @@ BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > +BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
> > > > > +{
> > > > > +       if (!ptr->data)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       return __bpf_dynptr_advance(ptr, len);
> > > > > +}
> > > > > +
> > > > >  static const struct bpf_func_proto bpf_dynptr_advance_proto = {
> > > > >         .func           = bpf_dynptr_advance,
> > > > >         .gpl_only       = false,
> > > > > @@ -1783,6 +1789,38 @@ static const struct bpf_func_proto bpf_dynptr_clone_proto = {
> > > > >         .arg2_type      = ARG_PTR_TO_DYNPTR | MEM_UNINIT,
> > > > >  };
> > > > >
> > > > > +BPF_CALL_4(bpf_dynptr_iterator, struct bpf_dynptr_kern *, ptr, void *, callback_fn,
> > > > > +          void *, callback_ctx, u64, flags)
> > > > > +{
> > > > > +       bpf_callback_t callback = (bpf_callback_t)callback_fn;
> > > > > +       int nr_bytes, err;
> > > > > +
> > > > > +       if (!ptr->data || flags)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       while (ptr->size > 0) {
> > > > > +               nr_bytes = callback((u64)(long)ptr, (u64)(long)callback_ctx, 0, 0, 0);
> > > > > +               if (nr_bytes <= 0)
> > > > > +                       return nr_bytes;
> > > >
> > > > callback is defined as returning long but here you are silently
> > > > truncating it to int. Let's either stick to longs or to ints.
> > > >
> > > > > +
> > > > > +               err = __bpf_dynptr_advance(ptr, nr_bytes);
> > > >
> > > > as Kumar pointed out, you can't modify ptr in place, you have to
> > > > create a local copy and bpf_dynptr_clone() it (so that for MALLOC
> > > > you'll bump the refcnt). Then advance and pass it to the callback.
> > > > David has such a local dynptr use case in the bpf_user_ringbuf_drain()
> > > > helper.
> > > >
> > >
> > > My solution was a bit overcomplicated because just creating a local
> > > copy doesn't fix this completely: there is still the hole of writing
> > > through arg#0 (which now does not reflect runtime state, as writes at
> > > runtime go to the clone while the verifier thinks you touched stack
> > > slots), and it still allows constructing the infinite loop (because,
> > > well, you can overwrite the dynptr through it). The 'side channel' of
> > > writing to the dynptr slots through callback_ctx is still there as well.
> > >
> > > Maybe the infinite loop _can_ be avoided if you clone inside each
> > > iteration, that part isn't very clear.
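
To make the infinite-loop concern concrete, here is a rough BPF-side sketch (hypothetical code, not from the patch set): if the verifier ends up treating the callback's arg#0 as ordinary writable stack, the callback can simply re-initialize the dynptr it is iterating over, so the helper's size-driven loop never drains. The scratch buffer and callback name below are made up for illustration; whether the verifier would actually accept such a program is exactly the open question, and that discussion continues below.

char scratch[64];	/* hypothetical global buffer to re-point the dynptr at */

static long resetting_cb(struct bpf_dynptr *ptr, void *ctx)
{
	/* If arg#0 is treated as plain writable stack, this re-initializes
	 * the dynptr the iterator is walking, restoring its size to 64 on
	 * every call, so advancing by 8 below never terminates the loop.
	 */
	bpf_dynptr_from_mem(scratch, sizeof(scratch), 0, ptr);
	return 8;
}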
> > >
> > > My reasoning was that when you iterate, the container is always
> > > immutable (to prevent iterator invalidation) while the mutability of
> > > the elements remains unaffected by that. Setting it in spilled_ptr
> > > covers all bases (PTR_TO_STACK arg#0, access to it through
> > > callback_ctx, etc.).
> > >
> > > But I totally agree with you that we should be working on a local copy
> > > inside the helper and leave the original dynptr untouched.
> > > I think the first arg should then not be PTR_TO_STACK but some other
> > > pointer type. Maybe it should be its own register type, and
> > > spilled_ptr should reflect the same register, which allows the dynptr
> > > state that we track to be copied into the callback arg#0 easily.
> >
> > Right, something like what David Vernet did with his
> > bpf_user_ringbuf_drain() helper that passes a kernel-only (not
> > PTR_TO_STACK) dynptr into the callback. If that implementation has the
> > same hole (being able to be modified), we should fix it the same way
> > in both cases, by not allowing this for PTR_TO_DYNPTR (or whatever the
> > new reg type is called).
> >
>
> Sorry for the late response.
>
> Somehow I missed that series entirely. Yes, we should reuse that
> register type, and it does seem like it needs to check for MEM_UNINIT
> to prevent reinitialization. I'm rolling that fix into my dynptr
> series that I'm sending next week, since it would lead to lots of
> conflicts otherwise. Then this set can use the same approach.

Sounds good, thanks!
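
For reference, a minimal sketch of the direction agreed on above: keep the callback's long return width and have the helper iterate over a local clone instead of advancing the caller's dynptr in place, with the callback receiving the clone (which would be typed as a kernel-managed dynptr rather than PTR_TO_STACK). This is not the actual patch; bpf_dynptr_clone_kern() and bpf_dynptr_put_kern() are hypothetical stand-ins for whatever kernel-side clone/refcount handling the series settles on.

BPF_CALL_4(bpf_dynptr_iterator, struct bpf_dynptr_kern *, ptr, void *, callback_fn,
	   void *, callback_ctx, u64, flags)
{
	bpf_callback_t callback = (bpf_callback_t)callback_fn;
	struct bpf_dynptr_kern iter;
	long nr_bytes;
	int err = 0;

	if (!ptr->data || flags)
		return -EINVAL;

	/* hypothetical: copies the dynptr and bumps the refcnt for MALLOC dynptrs */
	bpf_dynptr_clone_kern(ptr, &iter);

	while (__bpf_dynptr_get_size(&iter) > 0) {
		nr_bytes = callback((u64)(long)&iter, (u64)(long)callback_ctx, 0, 0, 0);
		if (nr_bytes <= 0) {
			err = nr_bytes;	/* 0 means done, negative means callback error */
			break;
		}
		if (nr_bytes > (long)U32_MAX) {
			err = -ERANGE;
			break;
		}

		err = __bpf_dynptr_advance(&iter, nr_bytes);
		if (err)
			break;
	}

	/* hypothetical: drops the clone's reference */
	bpf_dynptr_put_kern(&iter);
	return err;
}

With this shape the caller's dynptr (and the verifier's view of its stack slots) stays untouched, and the loop terminates because only the helper's private clone ever shrinks.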