On Tue, Nov 21, 2023 at 12:30 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote: > > On Tue, Nov 21, 2023 at 12:16 PM Hao Sun <sunhao.th@xxxxxxxxx> wrote: > > > > On Tue, Nov 21, 2023 at 1:46 AM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > It *feels* like this stack depth update *and* growing allocated stack > > > slots should happen somewhere in check_stack_access_within_bounds() or > > > right after it. It shouldn't matter whether we read or write to the > > > stack slot: either way that slot becomes part of the verifier state > > > that we should take into account during state comparison. Eduard not > > > so long ago added a change that allows reading STACK_INVALID slots, so > > > it's completely valid to read something that was never written to (and > > > so grow_stack_state() wasn't called for that slot, as it is > > > implemented right now). So I think we should fix that. > > > > > > > Agree. The following cases are currently confusing to me. > > Like Hao, I'm also confused about when reads from unitialized stack slots are > supposed to be allowed and when they're not. For example, the "uninitialized > stack1" test ([0]) seems to check that at least some type of uninitialized read > is not permitted. The reason why the respective access is rejected and this > test currently passes is the following check in check_stack_range_initialized: > [1]. We summarily reject the access to slots beyond state->allocated_stack. If > we were to not reject them there, and instead treat the slots as STACK_INVALID, > then the test's access would be allowed just below, through the combination of > env->allow_uninit_stack and `clobber`. And that seems like a consistent and sane behavior if bpf_allow_uninit_stack() is true, so I vote for fixing the test and make the logic consistent. > > So, assuming that this tests (and a few others) are sane, Andrii's suggestion > of calling grow_stack_state()/update_stack_depth() in > check_stack_access_within_bounds() does not immediately work: doing so > would change > the behavior in check_stack_range_initialized() and allow the access. > > On the other hand, perhaps the test is not sane and the access should be > permitted, in the spirit of allowing reads of uninitialized stack? Perhaps the > different treatment of slots beyond state->allocated_stack and STACK_INVALID yes, I think this divergence is not intentional, but maybe Eduard remembers some other quirks and why it is what it is, let's see. > slots is not intentional. Though, I am fairly confused about the idea of > allowing such reads in general - don't they allow leaking of kernel memory from > the uninit stack to user space? Even if restricted to root, is the leak ok? This is guarded by bpf_allow_uninit_stack(), which check CAP_PERFMON. Once BPF program has CAP_PERFMON, it can do bpf_probe_read_kernel() and generally can leak whatever kernel memory. So that's why STACK_INVALID is allowed to be read in such case. There is also bpf_allow_ptr_leaks() which also checks CAP_PERFMON. > > To also state the obvious - I'm happy to continue working on this patch and fix > a bug I've caused. However, if someone who actually knows the deal wants to > take over, I'm certainly not attached to anything :). It would be great if you can finish this, absolutely! Thanks! > > [0] https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/tools/testing/selftests/bpf/progs/verifier_basic_stack.c#L28-L44 > [1] https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/kernel/bpf/verifier.c#L7280 > > > > > > The verifier accepts the following program, which goes from #4 to #8 > > and directly read the stack at runtime without any previous write: > > func#0 @0 > > 0: R1=ctx() R10=fp0 > > 0: (bf) r6 = r10 ; R6_w=fp0 R10=fp0 > > 1: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar() > > 2: (bf) r3 = r0 ; R0_w=scalar(id=1) R3_w=scalar(id=1) > > 3: (bf) r8 = r0 ; R0_w=scalar(id=1) R8_w=scalar(id=1) > > 4: (4e) if w0 & w3 goto pc+3 ; R0_w=scalar(id=1) R3_w=scalar(id=1) > > 5: (63) *(u32 *)(r6 -196) = r3 ; R3_w=scalar(id=1) R6_w=fp0 > > fp-200=mmmm???? > > 6: (18) r7 = 0x19 ; R7=25 > > 8: (61) r7 = *(u32 *)(r6 -200) ; R6=fp0 > > R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > > fp-200=mmmm???? > > 9: (95) exit > > > > from 4 to 8: safe > > verification time 358 usec > > stack depth 200 > > processed 10 insns (limit 1000000) max_states_per_insn 0 total_states > > 1 peak_states 1 mark_read 1 > > > > The state is pruned, because of this: > > static bool stacksafe(...) > > .... > > if (env->allow_uninit_stack && > > old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC) > > continue; > > > > Yet, the sample direct read would be rejected: > > > > func#0 @0 > > 0: R1=ctx() R10=fp0 > > 0: (bf) r6 = r10 ; R6_w=fp0 R10=fp0 > > 1: (61) r7 = *(u32 *)(r6 -200) > > invalid read from stack R6 off=-200 size=4 > > > > Eduard, you added support for reading uninit slots, should we also add something > > like the following: > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 8c2d31aa3d31..aa861d2da240 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -6446,7 +6446,7 @@ static int check_stack_slot_within_bounds(int off, > > { > > int min_valid_off; > > > > - if (t == BPF_WRITE) > > + if (t == BPF_WRITE || env->allow_uninit_stack) > > min_valid_off = -MAX_BPF_STACK; > > else > > min_valid_off = -state->allocated_stack; as I mentioned above, yes, I think the behavior should be consistent and such accesses should be allowed, just as if it was STACK_INVALID