On Mon, Mar 13, 2023 at 7:41 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > On Mon, Mar 13, 2023 at 07:31:03AM CET, Joanne Koong wrote: > > On Fri, Mar 10, 2023 at 1:55 PM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Fri, Mar 10, 2023 at 1:30 PM Andrii Nakryiko > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > On Fri, Mar 10, 2023 at 1:15 PM Alexei Starovoitov > > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > > > On Tue, Mar 07, 2023 at 04:01:28PM -0800, Andrii Nakryiko wrote: > > > > > > > > > > > > > > > > > > I agree this is simpler, but I'm not sure it will work properly. Verifier won't > > > > > > > > > know when the lifetime of the buffer ends, so if we disallow spills until its > > > > > > > > > written over it's going to be a pain for users. > > > > > > > > > > > > > > > > > > Something like: > > > > > > > > > > > > > > > > > > for (...) { > > > > > > > > > char buf[64]; > > > > > > > > > bpf_dynptr_slice_rdwr(..., buf, 64); > > > > > > > > > ... > > > > > > > > > } > > > > > > > > > > > > > > > > > > .. and then compiler decides to spill something where buf was located on stack > > > > > > > > > outside the for loop. The verifier can't know when buf goes out of scope to > > > > > > > > > unpoison the slots. > > > > > > > > > > > > > > > > You're saying the "verifier doesn't know when buf ...". > > > > > > > > The same applies to the compiler. It has no visibility > > > > > > > > into what bpf_dynptr_slice_rdwr is doing. > > > > > > > > > > > > > > That is true, it can't assume anything about the side effects. But I am talking > > > > > > > about the point in the program when the buffer object no longer lives. Use of > > > > > > > the escaped pointer to such an object any longer is UB. The compiler is well > > > > > > > within its rights to reuse its stack storage at that point, including for > > > > > > > spilling registers. Which is why "outside the for loop" in my earlier reply. > > > > > > > > > > > > > > > So it never spills into a declared C array > > > > > > > > as I tried to explain in the previous reply. > > > > > > > > Spill/fill slots are always invisible to C. > > > > > > > > (unless of course you do pointer arithmetic asm style) > > > > > > > > > > > > > > When the declared array's lifetime ends, it can. > > > > > > > https://godbolt.org/z/Ez7v4xfnv > > > > > > > > > > > > > > The 2nd call to bar as part of unrolled loop happens with fp-8, then it calls > > > > > > > baz, spills r0 to fp-8, and calls bar again with fp-8. > > > > > > > > > > Right. If user writes such program and does explicit store of spillable > > > > > pointer into a stack. > > > > > I was talking about compiler generated spill/fill and I still believe > > > > > that compiler will not be reusing variable's stack memory for them. > > > > > > > > > > > > > > > > > > > If such a stack slot is STACK_POISON, verifier will reject this program. > > > > > > > > > > Yes and I think it's an ok trade-off. > > > > > The user has to specifically code such program to hit this issue. > > > > > I don't think we will see this in practice. > > > > > If we do we can consider a more complex fix. > > > > > > > > I was just debugging (a completely unrelated) issue where two > > > > completely independent functions, with different local variables, were > > > > reusing the same stack slots just because of them being inlined in > > > > parent functions. So stack reuse happens all the time, unfortunately. > > > > It's not always obvious or malicious. > > > > > > Right. Stack reuse happens for variables all the time. > > > I'm still arguing that compile internal spill/fill is coming > > > from different slots. > > > > > > When clang compiles the kernel it prints: > > > ../kernel/bpf/verifier.c:18017:5: warning: stack frame size (2296) > > > exceeds limit (2048) in 'bpf_check' [-Wframe-larger-than] > > > int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) > > > ^ > > > 572/2296 (24.91%) spills, 1724/2296 (75.09%) variables > > > > > > spills and variables are different areas. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > + *(void **)eth = (void *)0xdeadbeef; > > > > > > > > > > > + ctx = *(void **)buffer; > > > > > > > > > > > + eth_proto = eth->eth_proto + ctx->len; > > > > > > > > > > > if (eth_proto == bpf_htons(ETH_P_IP)) > > > > > > > > > > > err = process_packet(&ptr, eth, nh_off, false, ctx); > > > > > > > > > > > > > > > > > > > > > > I think the proper fix is to treat it as a separate return type distinct from > > > > > > > > > > > PTR_TO_MEM like PTR_TO_MEM_OR_PKT (or handle PTR_TO_MEM | DYNPTR_* specially), > > > > > > > > > > > fork verifier state whenever there is a write, so that one path verifies it as > > > > > > > > > > > PTR_TO_PACKET, while another as PTR_TO_STACK (if buffer was a stack ptr). I > > > > > > > > > > > think for the rest it's not a problem, but there are allow_ptr_leak checks > > > > > > > > > > > applied to PTR_TO_STACK and PTR_TO_MAP_VALUE, so that needs to be rechecked. > > > > > > > > > > > Then we ensure that program is safe in either path. > > > > > > > > > > > > > > > > > > > > > > Also we need to fix regsafe to not consider other PTR_TO_MEMs equivalent to such > > > > > > > > > > > a pointer. We could also fork verifier states on return, to verify either path > > > > > > > > > > > separately right from the point following the call instruction. > > > > > > > > > > > > > > > > > > > > This is too complex imo. > > > > > > > > > > > > > > > > > > A better way to phrase this is to verify with R0 = PTR_TO_PACKET in one path, > > > > > > > > > and push_stack with R0 = buffer's reg->type + size set to len in the other path > > > > > > > > > for exploration later. In terms of verifier infra everything is there already, > > > > > > > > > it just needs to analyze both cases which fall into the regular code handling > > > > > > > > > the reg->type's. Probably then no adjustments to regsafe are needed either. It's > > > > > > > > > like exploring branch instructions. > > > > > > > > > > > > > > > > I still don't like it. There is no reason to go a complex path > > > > > > > > when much simpler suffices. > > > > > > > > > > > > This issue you are discussing is the reason we don't support > > > > > > bpf_dynptr_from_mem() taking PTR_TO_STACK (which is a pity, but we > > > > > > postponed it initially). > > > > > > > > > > > > I've been thinking about something along the lines of STACK_POISON, > > > > > > but remembering associated id/ref_obj_id. When ref is released, turn > > > > > > STACK_POISON to STACK_MISC. If it's bpf_dynptr_slice_rdrw() or > > > > > > bpf_dynptr_from_mem(), which don't have ref_obj_id, they still have ID > > > > > > associated with returned pointer, so can we somehow incorporate that? > > > > > > > > > > There is dynptr_id in PTR_TO_MEM that is used by destroy_if_dynptr_stack_slot(), > > > > > but I don't see how we can use it to help this case. > > > > > imo plain STACK_POISON that is overwriteable by STACK_MISC/STACK_ZERO > > > > > should be good enough in practice. > > > > > > > > That's basically what I'm proposing, except when this overwrite > > > > happens we have to go and invalidate all the PTR_TO_MEM references > > > > that are pointing to that stack slot. E.g., in the below case > > > > (assuming we allow LOCAL dynptr to be constructed from stack) > > > > > > > > char buf[256], *p; > > > > struct bpf_dynptr dptr; > > > > > > > > bpf_dynptr_from_mem(buf, buf+256, &dptr); > > > > > > > > p = bpf_dynptr_data(&dptr, 128, 16); /* get 16-byte slice into buf, at > > > > offset 128 */ > > > > > > > > /* buf[128] through buf[128+16] are STACK_POISON */ > > > > > > > > buf[128] = 123; > > > > > > > > So here is where the problem happens. Should we invalidate just p > > > > here? Or entire dptr? Haven't thought much about details, but > > > > something like that. It was getting messy when we started to think > > > > about this with Joanne. > > > > > > Let's table dynptr_from_mem for a second and solve > > > bpf_dynptr_slice_rdrw first, since I'm getting confused. > > > > > > For bpf_dynptr_slice_rdrw we can mark buffer[] in stack as > > > poisoned with dynptr_id == R0's PTR_TO_MEM dynptr_id. > > > Then as soon as first spillable reg touches that poisoned stack area > > > we can invalidate all PTR_TO_MEM's with that dynptr_id. > > > > Okay, this makes sense to me. are you already currently working or > > planning to work on a fix for this Kumar, or should i take a stab at > > it? > > I'm not planning to do so, so go ahead. One more thing I noticed just now is > that we probably need to update regsafe to perform a check_ids comparison for > dynptr_id for dynptr PTR_TO_MEMs? It was not a problem back when f8064ab90d66 > ("bpf: Invalidate slices on destruction of dynptrs on stack") was added but > 567da5d253cd ("bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}") > added PTR_TO_MEM in the switch statement. I can take care of this. But I really would like to avoid these special cases of extra dynptr_id, exactly for reasons like this omitted check. What do people think about generalizing current ref_obj_id to be more like "lifetime id" (to borrow Rust terminology a bit), which would be an object (which might or might not be a tracked reference) defining the scope/lifetime of the current register (whatever it represents). I haven't looked through code much, but I've been treating ref_obj_id as that already in my thinking before, and it seems to be a better approach than having a special-case of dynptr_id. Thoughts?