On Tue, Mar 7, 2023 at 9:35 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> On Tue, Mar 07, 2023 at 04:45:14PM CET, Alexei Starovoitov wrote:
> > On Tue, Mar 7, 2023 at 2:22 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 07, 2023 at 03:23:25AM CET, Alexei Starovoitov wrote:
> > > > On Sun, Mar 5, 2023 at 11:10 PM Kumar Kartikeya Dwivedi
> > > > <memxor@xxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Mar 01, 2023 at 04:49:52PM CET, Joanne Koong wrote:
> > > > > > Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr.
> > > > > > The user must pass in a buffer to store the contents of the data slice
> > > > > > if a direct pointer to the data cannot be obtained.
> > > > > >
> > > > > > For skb and xdp type dynptrs, these two APIs are the only way to obtain
> > > > > > a data slice. However, for other types of dynptrs, there is no
> > > > > > difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data.
> > > > > >
> > > > > > For skb type dynptrs, the data is copied into the user provided buffer
> > > > > > if any of the data is not in the linear portion of the skb. For xdp type
> > > > > > dynptrs, the data is copied into the user provided buffer if the data is
> > > > > > between xdp frags.
> > > > > >
> > > > > > If the skb is cloned and a call to bpf_dynptr_data_rdwr is made, then
> > > > > > the skb will be uncloned (see bpf_unclone_prologue()).
> > > > > >
> > > > > > Please note that any bpf_dynptr_write() automatically invalidates any prior
> > > > > > data slices of the skb dynptr. This is because the skb may be cloned or
> > > > > > may need to pull its paged buffer into the head. As such, any
> > > > > > bpf_dynptr_write() will automatically have its prior data slices
> > > > > > invalidated, even if the write is to data in the skb head of an uncloned
> > > > > > skb. Please note as well that any other helper calls that change the
> > > > > > underlying packet buffer (eg bpf_skb_pull_data()) invalidate any data
> > > > > > slices of the skb dynptr as well, for the same reasons.
> > > > > >
> > > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > > > > > ---
> > > > >
> > > > > Sorry for chiming in late.
> > > > >
> > > > > I see one potential hole in bpf_dynptr_slice_rdwr. If the returned pointer is
> > > > > actually pointing to the stack (but verified as a PTR_TO_MEM in verifier state),
> > > > > we won't reflect changes to the stack state in the verifier for writes happening
> > > > > through it.
> > > > >
> > > > > For the worst case scenario, this will basically allow overwriting values of
> > > > > spilled pointers and doing arbitrary kernel memory reads/writes. This is only an
> > > > > issue when bpf_dynptr_slice_rdwr at runtime returns a pointer to the supplied
> > > > > buffer residing on the program stack. To verify, by forcing the memcpy into the
> > > > > buffer for skb_header_pointer I was able to make it dereference a garbage value
> > > > > for the l4lb_all selftest.
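
To make the failure mode easier to picture, here is a minimal sketch of the
usage pattern in question (my own illustration, not taken from the selftest;
the kfunc prototypes are written out inline rather than pulled from a header,
and the section/program names are assumptions):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
			       struct bpf_dynptr *ptr) __ksym;
extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *ptr, __u32 offset,
				   void *buffer, __u32 buffer__szk) __ksym;

SEC("tc")
int slice_example(struct __sk_buff *skb)
{
	struct bpf_dynptr ptr;
	struct ethhdr *eth;
	char buf[sizeof(*eth)];		/* backing storage on the program stack */

	if (bpf_dynptr_from_skb(skb, 0, &ptr))
		return TC_ACT_SHOT;

	/* Verified as PTR_TO_MEM of size sizeof(buf); at runtime eth may
	 * point into the skb head *or* alias buf, i.e. this frame's stack.
	 */
	eth = bpf_dynptr_slice_rdwr(&ptr, 0, buf, sizeof(buf));
	if (!eth)
		return TC_ACT_SHOT;

	/* Writes through eth are only tracked as PTR_TO_MEM writes; when
	 * eth aliases buf, the verifier's picture of those stack slots
	 * (e.g. a pointer spilled there) is not updated -- the hole above.
	 */
	eth->h_proto = bpf_htons(ETH_P_IP);
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";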
> > > > >
> > > > > --- a/kernel/bpf/helpers.c
> > > > > +++ b/kernel/bpf/helpers.c
> > > > > @@ -2253,7 +2253,13 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
> > > > >  	case BPF_DYNPTR_TYPE_RINGBUF:
> > > > >  		return ptr->data + ptr->offset + offset;
> > > > >  	case BPF_DYNPTR_TYPE_SKB:
> > > > > -		return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> > > > > +	{
> > > > > +		void *p = skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> > > > > +		if (p == buffer)
> > > > > +			return p;
> > > > > +		memcpy(buffer, p, len);
> > > > > +		return buffer;
> > > > > +	}
> > > > >
> > > > > --- a/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> > > > > +++ b/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> > > > > @@ -470,7 +470,10 @@ int balancer_ingress(struct __sk_buff *ctx)
> > > > >  	eth = bpf_dynptr_slice_rdwr(&ptr, 0, buffer, sizeof(buffer));
> > > > >  	if (!eth)
> > > > >  		return TC_ACT_SHOT;
> > > > > -	eth_proto = eth->eth_proto;
> > > > > +	*(void **)buffer = ctx;
> > > >
> > > > Great catch.
> > > > To fix the issue I think we should simply disallow such
> > > > stack abuse. The compiler won't be spilling registers
> > > > into a C array on the stack.
> > > > This manual spill/fill is exploiting verifier logic.
> > > > After bpf_dynptr_slice_rdwr() we can mark all slots of the
> > > > buffer as STACK_POISON or some better name and
> > > > reject spills into such slots.
> > > >
> > >
> > > I agree this is simpler, but I'm not sure it will work properly. The verifier won't
> > > know when the lifetime of the buffer ends, so if we disallow spills until it's
> > > written over, it's going to be a pain for users.
> > >
> > > Something like:
> > >
> > > for (...) {
> > > 	char buf[64];
> > > 	bpf_dynptr_slice_rdwr(..., buf, 64);
> > > 	...
> > > }
> > >
> > > ... and then the compiler decides to spill something where buf was located on the
> > > stack outside the for loop. The verifier can't know when buf goes out of scope to
> > > unpoison the slots.
> >
> > You're saying the "verifier doesn't know when buf ...".
> > The same applies to the compiler. It has no visibility
> > into what bpf_dynptr_slice_rdwr is doing.
>
> That is true, it can't assume anything about the side effects. But I am talking
> about the point in the program when the buffer object no longer lives. Any further
> use of the escaped pointer to such an object is UB. The compiler is well
> within its rights to reuse its stack storage at that point, including for
> spilling registers. Which is why I said "outside the for loop" in my earlier reply.
>
> > So it never spills into a declared C array
> > as I tried to explain in the previous reply.
> > Spill/fill slots are always invisible to C.
> > (unless of course you do pointer arithmetic asm style)
>
> When the declared array's lifetime ends, it can.
> https://godbolt.org/z/Ez7v4xfnv
>
> The 2nd call to bar as part of the unrolled loop happens with fp-8, then it calls
> baz, spills r0 to fp-8, and calls bar again with fp-8.
>
> If such a stack slot is STACK_POISON, the verifier will reject this program.
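
For anyone not clicking through, the shape of the code behind the godbolt link,
as I read the description above, is roughly the following (a reconstruction, not
the exact snippet; the bar/baz names come from the description, the rest is
assumed, and whether the compiler really reuses buf's slot for the spill depends
on register pressure and compiler version):

struct bpf_dynptr;
extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *ptr, unsigned int offset,
				   void *buffer, unsigned int buffer_len);
void bar(char *p);
long baz(void);

void foo(struct bpf_dynptr *ptr, long *out)
{
	for (int i = 0; i < 2; i++) {
		char buf[8];		/* lives at, say, fp-8 */

		bpf_dynptr_slice_rdwr(ptr, 0, buf, sizeof(buf));
		bar(buf);
	}

	/* buf is dead here.  The value returned by baz() must survive the
	 * bar() call below, so the compiler is entitled to spill it into
	 * the now-dead fp-8 slot -- exactly the slot that was handed to
	 * bpf_dynptr_slice_rdwr() and would still be marked STACK_POISON.
	 */
	long v = baz();
	bar((char *)out);
	*out = v;
}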
> > > > >
> > > > > +	*(void **)eth = (void *)0xdeadbeef;
> > > > > +	ctx = *(void **)buffer;
> > > > > +	eth_proto = eth->eth_proto + ctx->len;
> > > > >  	if (eth_proto == bpf_htons(ETH_P_IP))
> > > > >  		err = process_packet(&ptr, eth, nh_off, false, ctx);
> > > > >
> > > > > I think the proper fix is to treat it as a separate return type distinct from
> > > > > PTR_TO_MEM like PTR_TO_MEM_OR_PKT (or handle PTR_TO_MEM | DYNPTR_* specially),
> > > > > fork verifier state whenever there is a write, so that one path verifies it as
> > > > > PTR_TO_PACKET, while another as PTR_TO_STACK (if buffer was a stack ptr). I
> > > > > think for the rest it's not a problem, but there are allow_ptr_leak checks
> > > > > applied to PTR_TO_STACK and PTR_TO_MAP_VALUE, so that needs to be rechecked.
> > > > > Then we ensure that the program is safe in either path.
> > > > >
> > > > > Also, we need to fix regsafe to not consider other PTR_TO_MEMs equivalent to such
> > > > > a pointer. We could also fork verifier states on return, to verify either path
> > > > > separately right from the point following the call instruction.
> > > >
> > > > This is too complex imo.
> > >
> > > A better way to phrase this is to verify with R0 = PTR_TO_PACKET in one path,
> > > and push_stack with R0 = buffer's reg->type + size set to len in the other path
> > > for exploration later. In terms of verifier infra, everything is there already;
> > > it just needs to analyze both cases, which fall into the regular code handling
> > > the reg->type's. Probably then no adjustments to regsafe are needed either. It's
> > > like exploring branch instructions.
> >
> > I still don't like it. There is no reason to go down a complex path
> > when a much simpler one suffices.

This issue you are discussing is the reason we don't support
bpf_dynptr_from_mem() taking PTR_TO_STACK (which is a pity, but we postponed it
initially).

I've been thinking about something along the lines of STACK_POISON, but
remembering the associated id/ref_obj_id. When the ref is released, turn
STACK_POISON back into STACK_MISC. If it's bpf_dynptr_slice_rdwr() or
bpf_dynptr_from_mem(), which don't have a ref_obj_id, they still have an ID
associated with the returned pointer, so can we somehow incorporate that?
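
To spell out how I'd expect that to look from the program's point of view,
here's a sketch (same includes and kfunc declarations as the first sketch
earlier in this mail; the behaviour described in the comments is hypothetical,
just one way the id-tracked poison could work, none of it implemented today):

SEC("tc")
int poison_example(struct __sk_buff *skb)
{
	struct bpf_dynptr ptr;
	struct ethhdr *eth;
	char buf[16];

	if (bpf_dynptr_from_skb(skb, 0, &ptr))
		return TC_ACT_SHOT;

	eth = bpf_dynptr_slice_rdwr(&ptr, 0, buf, sizeof(buf));
	if (!eth)
		return TC_ACT_SHOT;

	/* buf's slots become STACK_POISON, remembering eth's id: spilling
	 * a pointer into them would now be rejected.
	 */
	*(void **)buf = (void *)skb;		/* rejected while poisoned */

	/* A bpf_dynptr_write() invalidates prior slices of an skb dynptr,
	 * so eth's id is gone; the poisoned slots flip back to STACK_MISC
	 * and the stack can be reused normally again.
	 */
	bpf_dynptr_write(&ptr, 0, buf, sizeof(buf), 0);
	*(void **)buf = (void *)skb;		/* fine once unpoisoned */

	return TC_ACT_OK;
}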