On Fri, Jan 13, 2023 at 4:10 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Fri, 2023-01-13 at 14:22 -0800, Andrii Nakryiko wrote: > > On Fri, Jan 13, 2023 at 12:02 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > On Wed, 2023-01-11 at 16:24 -0800, Andrii Nakryiko wrote: > > > [...] > > > > > > > > I'm wondering if we should consider allowing uninitialized > > > > (STACK_INVALID) reads from stack, in general. It feels like it's > > > > causing more issues than is actually helpful in practice. Common code > > > > pattern is to __builtin_memset() some struct first, and only then > > > > initialize it, basically doing unnecessary work of zeroing out. All > > > > just to avoid verifier to complain about some irrelevant padding not > > > > being initialized. I haven't thought about this much, but it feels > > > > that STACK_MISC (initialized, but unknown scalar value) is basically > > > > equivalent to STACK_INVALID for all intents and purposes. Thoughts? > > > > > > Do you have an example of the __builtin_memset() usage? > > > I tried passing partially initialized stack allocated structure to > > > bpf_map_update_elem() and bpf_probe_write_user() and verifier did not > > > complain. > > > > > > Regarding STACK_MISC vs STACK_INVALID, I think it's ok to replace > > > STACK_INVALID with STACK_MISC if we are talking about STX/LDX/ALU > > > instructions because after LDX you would get a full range register and > > > you can't do much with a full range value. However, if a structure > > > containing un-initialized fields (e.g. not just padding) is passed to > > > a helper or kfunc is it an error? > > > > if we are passing stack as a memory to helper/kfunc (which should be > > the only valid use case with STACK_MISC, right?), then I think we > > expect helper/kfunc to treat it as memory with unknowable contents. > > Not sure if I'm missing something, but MISC says it's some unknown > > value, and the only difference between INVALID and MISC is that MISC's > > value was written by program explicitly, while for INVALID that > > garbage value was there on the stack already (but still unknowable > > scalar), which effectively is the same thing. > > I looked through the places where STACK_INVALID is used, here is the list: > > - unmark_stack_slots_dynptr() > Destroy dynptr marks. Suppose STACK_INVALID is replaced by > STACK_MISC here, in this case a scalar read would be possible from > such slot, which in turn might lead to pointer leak. > Might be a problem? We are already talking to enable reading STACK_DYNPTR slots directly. So not a problem? > > - scrub_spilled_slot() > mark spill slot STACK_MISC if not STACK_INVALID > Called from: > - save_register_state() called from check_stack_write_fixed_off() > Would mark not all slots only for 32-bit writes. > - check_stack_write_fixed_off() for insns like `fp[-8] = <const>` to > destroy previous stack marks. > - check_stack_range_initialized() > here it always marks all 8 spi slots as STACK_MISC. > Looks like STACK_MISC instead of STACK_INVALID wouldn't make a > difference in these cases. > > - check_stack_write_fixed_off() > Mark insn as sanitize_stack_spill if pointer is spilled to a stack > slot that is marked STACK_INVALID. This one is a bit strange. > E.g. the program like this: > > ... > 42: fp[-8] = ptr > ... > > Will mark insn (42) as sanitize_stack_spill. > However, the program like this: > > ... > 21: fp[-8] = 22 ;; marks as STACK_MISC > ... > 42: fp[-8] = ptr > ... > > Won't mark insn (42) as sanitize_stack_spill, which seems strange. > > - stack_write_var_off() > If !env->allow_ptr_leaks only allow writes if slots are not > STACK_INVALID. I'm not sure I understand the intention. > > - clean_func_state() > STACK_INVALID is used to mark spi's that are not REG_LIVE_READ as > such that should not take part in the state comparison. However, > stacksafe() has REG_LIVE_READ check as well, so this marking might > be unnecessary. > > - stacksafe() > STACK_INVALID is used as a mark that some bytes of an spi are not > important in a state cached for state comparison. E.g. a slot in an > old state might be marked 'mmmm????' and 'mmmmmmmm' or 'mmmm0000' in > a new state. However other checks in stacksafe() would catch these > variations. > > The conclusion being that some pointer leakage checks might need > adjustment if STACK_INVALID is replaced by STACK_MISC. Just to be clear. My suggestion was to *treat* STACK_INVALID as equivalent to STACK_MISC in stacksafe(), not really replace all the uses of STACK_INVALID with STACK_MISC. And to be on the safe side, I'd do it only if env->allow_ptr_leaks, of course. > > > > > > > > > > Obviously, this is a completely separate change and issue from what > > > > you are addressing in this patch set. > > > > > > > > Awesome job on tracking this down and fixing it! For the patch set: > > > > > > Thank you for reviewing this issue with me. > > > > > > > > > > > Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > > > > > > > > > [...] >