On Thu, Oct 20, 2022 at 08:26:44AM IST, Alexei Starovoitov wrote: > On Wed, Oct 19, 2022 at 7:40 PM Kumar Kartikeya Dwivedi > <memxor@xxxxxxxxx> wrote: > > > > On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote: > > > On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi > > > <memxor@xxxxxxxxx> wrote: > > > > > > > > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote: > > > > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi > > > > > <memxor@xxxxxxxxx> wrote: > > > > > > > > > > > > Currently, the dynptr function is not checking the variable offset part > > > > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered > > > > > > when computing the stack pointer index, but if the variable offset was > > > > > > not a constant (such that it could not be accumulated in reg->off), we > > > > > > will end up a discrepency where runtime pointer does not point to the > > > > > > actual stack slot we mark as STACK_DYNPTR. > > > > > > > > > > > > It is impossible to precisely track dynptr state when variable offset is > > > > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc. > > > > > > simply reject the case where reg->var_off is not constant. Then, > > > > > > consider both reg->off and reg->var_off.value when computing the stack > > > > > > pointer index. > > > > > > > > > > > > A new helper dynptr_get_spi is introduced to hide over these details > > > > > > since the dynptr needs to be located in multiple places outside the > > > > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we > > > > > > need to enforce these checks in all places. > > > > > > > > > > > > Note that it is disallowed for unprivileged users to have a non-constant > > > > > > var_off, so this problem should only be possible to trigger from > > > > > > programs having CAP_PERFMON. However, its effects can vary. > > > > > > > > > > > > Without the fix, it is possible to replace the contents of the dynptr > > > > > > arbitrarily by making verifier mark different stack slots than actual > > > > > > location and then doing writes to the actual stack address of dynptr at > > > > > > runtime. > > > > > > > > > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > > > > > --- > > > > > > kernel/bpf/verifier.c | 80 +++++++++++++++---- > > > > > > .../testing/selftests/bpf/prog_tests/dynptr.c | 6 +- > > > > > > .../bpf/prog_tests/kfunc_dynptr_param.c | 2 +- > > > > > > 3 files changed, 67 insertions(+), 21 deletions(-) > > > > > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > > > index 8f667180f70f..0fd73f96c5e2 100644 > > > > > > --- a/kernel/bpf/verifier.c > > > > > > +++ b/kernel/bpf/verifier.c > > > > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env, > > > > > > verbose(env, "D"); > > > > > > } > > > > > > > > > > > > -static int get_spi(s32 off) > > > > > > +static int __get_spi(s32 off) > > > > > > { > > > > > > return (-off - 1) / BPF_REG_SIZE; > > > > > > } > > > > > > > > > > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > > > > > +{ > > > > > > + int spi; > > > > > > + > > > > > > + if (reg->off % BPF_REG_SIZE) { > > > > > > + verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > > > > > I think this cannot happen. > > > > > > > > > > > > > There are existing selftests that trigger this. > > > > > > Really. Which one is that? > > > Those that you've modified in this patch are hitting > > > "cannot pass in dynptr..." message from the check below, no? > > > > > > > Just taking one example, invalid_read2 which does: > > > > bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0); > > > > does hit this one, it passes fp-15, no var_off. > > > > Same with invalid_helper2 that was updated. > > Same with invalid_offset that was updated. > > invalid_write3 gained coverage from this patch, earlier it was probably just > > being rejected because of arg_type_is_release checking spilled_ptr.id. > > not_valid_dynptr is also hitting this one, not the one below. > > > > The others now started hitting this error as the order of checks was changed in > > the verifier. Since arg_type_is_release checking happens before > > process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr. > > At that point no checks have been made of the dynptr argument, so dynptr_get_spi > > is required to ensure spi is in bounds. > > > > The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone > > is not sufficient. This is why I wrapped everything into dynptr_get_spi. > > I see. That was not obvious at all that some other patch > is removing that check from check_func_arg_reg_off. > It is done in patch 4. There I move that check from the check_func_arg_reg_off to process_dynptr_func. > Why is the check there not sufficient? > I wanted to keep check_func_arg_reg_off free of assumptions for helper specific checks. It just ensures a few rules: When OBJ_RELEASE, offsets (fixed and var are 0) Otherwise, for some specific register types, allow fixed and var_off. For PTR_TO_BTF_ID, allow fixed but not var_off. Reject any fixed or var_off for all other cases. Everything else is handled on top of that. > > > > Or do you mean it cannot happen anymore? If so, why? > > > > > > Why would it? There is an alignment check earlier. > > > > > > > I removed the one in check_func_arg_reg_off. So this is the only place now where > > this alignment check happens. > > > > > > > > + if (!tnum_is_const(reg->var_off)) { > > > > > > + verbose(env, "dynptr has to be at the constant offset\n"); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > > > > > This part can. > > > > > > > > > > > + spi = __get_spi(reg->off + reg->var_off.value); > > > > > > + if (spi < 1) { > > > > > > + verbose(env, "cannot pass in dynptr at an offset=%d\n", > > > > > > + (int)(reg->off + reg->var_off.value)); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > + return spi; > > > > > > +} > > > > > > > > > > This one is a more conservative (read: redundant) check. > > > > > The is_spi_bounds_valid() is doing it better. > > > > > > > > The problem is, is_spi_bounds_valid returning an error is not always a problem. > > > > See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds, > > > > then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it > > > > to grow the stack depth and do MAX_BPF_STACK check. > > > > > > It's a weird one. I'm not sure it's actually correct to do it this way. > > > > > > > Yeah, when looking at this I was actually surprised by that return true, > > thinking that was by accident and the stack depth was not being updated, but it > > later happens using check_mem_access in that if block. > > > > I'm open to other ideas, like separating out code in > > check_stack_write_fixed_off, but the only issue is code divergence and we miss > > checks we need to in both places due to duplication. Let me know what you think. > > Not following. Why check_stack_write_fixed_off has to do with any of that? > Well, I thought you didn't consider check_mem_access based simulation of writes to grow stack bounds to be clean, so I was soliciting opinions on how it could be done otherwise. It ends up calling check_stack_write_fixed_off internally. per > > > It's a weird one. I'm not sure it's actually correct to do it this way. but maybe I misunderstood and you meant it for is_spi_bounds_valid only. > The bug you're fixing is missing tnum_is_const(reg->var_off), right? > All other changes make it hard to understand what is going on. > In this patch, there is no other change. Every site that used get_spi(reg->off) now uses get_spi(reg->off + reg->var_off.value) essentially. For dynptr, only spi 1 and above are valid values. The main ugliness comes because it needs to get ref_obj_id earlier before argument processing begins in arg_type_is_release block. Maybe that step should be moved later below, I don't see anything using meta->ref_obj_id inside functions called by the switch case. Also, going back to what you said earlier: > If we only have get_spi_and_check() we'd have to add > WARN_ON_ONCE in a few places and that bothers me... > due to defensive programming... > If code is so complex that we cannot think it through > we have to refactor it. Sprinkling WARN_ON_ONCE (just to be sure) > doesn't inspire confidence. > Once we are done with process_dynptr_func, the rest of code can assume it points to a valid stack location where dynptr needs to be marked/unmarked, so the rest of the code doesn't do any checking of the spi etc.