On Fri, Jan 13, 2023 at 5:17 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Fri, Jan 13, 2023 at 4:10 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > On Fri, 2023-01-13 at 14:22 -0800, Andrii Nakryiko wrote:
> > > On Fri, Jan 13, 2023 at 12:02 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, 2023-01-11 at 16:24 -0800, Andrii Nakryiko wrote:
> > > > [...]
> > > > >
> > > > > I'm wondering if we should consider allowing uninitialized
> > > > > (STACK_INVALID) reads from stack, in general. It feels like it's
> > > > > causing more issues than is actually helpful in practice. Common code
> > > > > pattern is to __builtin_memset() some struct first, and only then
> > > > > initialize it, basically doing unnecessary work of zeroing out. All
> > > > > just to avoid verifier to complain about some irrelevant padding not
> > > > > being initialized. I haven't thought about this much, but it feels
> > > > > that STACK_MISC (initialized, but unknown scalar value) is basically
> > > > > equivalent to STACK_INVALID for all intents and purposes. Thoughts?
> > > >
> > > > Do you have an example of the __builtin_memset() usage?
> > > > I tried passing partially initialized stack allocated structure to
> > > > bpf_map_update_elem() and bpf_probe_write_user() and verifier did not
> > > > complain.
> > > >
> > > > Regarding STACK_MISC vs STACK_INVALID, I think it's ok to replace
> > > > STACK_INVALID with STACK_MISC if we are talking about STX/LDX/ALU
> > > > instructions because after LDX you would get a full range register and
> > > > you can't do much with a full range value. However, if a structure
> > > > containing un-initialized fields (e.g. not just padding) is passed to
> > > > a helper or kfunc is it an error?
> > >
> > > if we are passing stack as a memory to helper/kfunc (which should be
> > > the only valid use case with STACK_MISC, right?), then I think we
> > > expect helper/kfunc to treat it as memory with unknowable contents.
> > > Not sure if I'm missing something, but MISC says it's some unknown
> > > value, and the only difference between INVALID and MISC is that MISC's
> > > value was written by program explicitly, while for INVALID that
> > > garbage value was there on the stack already (but still unknowable
> > > scalar), which effectively is the same thing.
> >
> > I looked through the places where STACK_INVALID is used, here is the list:
> >
> > - unmark_stack_slots_dynptr()
> >   Destroy dynptr marks. Suppose STACK_INVALID is replaced by
> >   STACK_MISC here, in this case a scalar read would be possible from
> >   such slot, which in turn might lead to pointer leak.
> >   Might be a problem?
>
> We are already talking to enable reading STACK_DYNPTR slots directly.
> So not a problem?
>
> >
> > - scrub_spilled_slot()
> >   mark spill slot STACK_MISC if not STACK_INVALID
> >   Called from:
> >   - save_register_state() called from check_stack_write_fixed_off()
> >     Would mark not all slots only for 32-bit writes.
> >   - check_stack_write_fixed_off() for insns like `fp[-8] = <const>` to
> >     destroy previous stack marks.
> >   - check_stack_range_initialized()
> >     here it always marks all 8 spi slots as STACK_MISC.
> >   Looks like STACK_MISC instead of STACK_INVALID wouldn't make a
> >   difference in these cases.
> >
> > - check_stack_write_fixed_off()
> >   Mark insn as sanitize_stack_spill if pointer is spilled to a stack
> >   slot that is marked STACK_INVALID. This one is a bit strange.
> >   E.g. the program like this:
> >
> >     ...
> >     42: fp[-8] = ptr
> >     ...
> >
> >   Will mark insn (42) as sanitize_stack_spill.
> >   However, the program like this:
> >
> >     ...
> >     21: fp[-8] = 22   ;; marks as STACK_MISC
> >     ...
> >     42: fp[-8] = ptr
> >     ...
> >
> >   Won't mark insn (42) as sanitize_stack_spill, which seems strange.
> >
> > - stack_write_var_off()
> >   If !env->allow_ptr_leaks only allow writes if slots are not
> >   STACK_INVALID. I'm not sure I understand the intention.
> >
> > - clean_func_state()
> >   STACK_INVALID is used to mark spi's that are not REG_LIVE_READ as
> >   such that should not take part in the state comparison. However,
> >   stacksafe() has REG_LIVE_READ check as well, so this marking might
> >   be unnecessary.
> >
> > - stacksafe()
> >   STACK_INVALID is used as a mark that some bytes of an spi are not
> >   important in a state cached for state comparison. E.g. a slot in an
> >   old state might be marked 'mmmm????' and 'mmmmmmmm' or 'mmmm0000' in
> >   a new state. However other checks in stacksafe() would catch these
> >   variations.
> >
> > The conclusion being that some pointer leakage checks might need
> > adjustment if STACK_INVALID is replaced by STACK_MISC.
>
> Just to be clear. My suggestion was to *treat* STACK_INVALID as
> equivalent to STACK_MISC in stacksafe(), not really replace all the
> uses of STACK_INVALID with STACK_MISC. And to be on the safe side, I'd
> do it only if env->allow_ptr_leaks, of course.

Well, that, and to allow STACK_INVALID if env->allow_ptr_leaks in
check_stack_read_fixed_off(), of course, to avoid "invalid read from
stack off %d+%d size %d\n" error (that's fixing at least part of the
problem with uninitialized struct padding).

>
> >
> > >
> > > >
> > > > > Obviously, this is a completely separate change and issue from what
> > > > > you are addressing in this patch set.
> > > > >
> > > > > Awesome job on tracking this down and fixing it! For the patch set:
> > > >
> > > > Thank you for reviewing this issue with me.
> > > >
> > > > >
> > > > > Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > >
> > > > >
> > > > [...]
> >
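
To make the two suggested relaxations concrete, here is a rough
user-space sketch of the per-byte logic. It is a simplified model and
not the verifier code: the enum only mirrors the slot type names used
in this thread, the helper names stack_byte_readable() and
stack_byte_safe() are made up for illustration, and the extra register
comparison that the real stacksafe() does for spilled slots is left
out entirely.

```c
#include <stdbool.h>

/* Simplified model of per-byte stack slot marks; NOT kernel code. */
enum slot_type {
	STACK_INVALID,	/* never written by the program */
	STACK_SPILL,	/* holds (part of) a spilled register */
	STACK_MISC,	/* written, but value is an unknown scalar */
	STACK_ZERO,	/* known to be zero */
};

/*
 * Hypothetical helper: may a fixed-offset stack read touch this byte?
 * Models the idea of accepting STACK_INVALID only when pointer leaks
 * are allowed, instead of failing with "invalid read from stack".
 */
static bool stack_byte_readable(enum slot_type t, bool allow_ptr_leaks)
{
	if (t == STACK_INVALID)
		return allow_ptr_leaks;
	return true;
}

/*
 * Hypothetical helper: is byte 'cur' of the new state covered by byte
 * 'old' of a cached state? Models treating STACK_INVALID in the new
 * state as if it were STACK_MISC, gated on allow_ptr_leaks.
 */
static bool stack_byte_safe(enum slot_type old, enum slot_type cur,
			    bool allow_ptr_leaks)
{
	/* the cached state never looked at this byte */
	if (old == STACK_INVALID)
		return true;

	/* the suggested relaxation: INVALID is just an unknown scalar */
	if (allow_ptr_leaks && cur == STACK_INVALID)
		cur = STACK_MISC;

	/* safe with unknown data implies safe with zeroes */
	if (old == STACK_MISC && cur == STACK_ZERO)
		return true;

	return old == cur;
}
```

With allow_ptr_leaks set, an old 'mmmm' byte matches a new '????' byte
in this model, and without it the comparison stays as strict as before.
The gating on allow_ptr_leaks is kept in both helpers because that is
the "safe side" condition mentioned above.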