On Thu, 28 Nov 2024 at 05:31, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Wed, 2024-11-27 at 08:58 -0800, Kumar Kartikeya Dwivedi wrote: > > Teach the verifier about IRQ-disabled sections through the introduction > > of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable > > them, and bpf_local_irq_restore, to restore IRQ state and enable them > > back again. > > > > For the purposes of tracking the saved IRQ state, the verifier is taught > > about a new special object on the stack of type STACK_IRQ_FLAG. This is > > a 8 byte value which saves the IRQ flags which are to be passed back to > > the IRQ restore kfunc. > > > > Renumber the enums for REF_TYPE_* to simplify the check in > > find_lock_state, filtering out non-lock types as they grow will become > > cumbersome and is unecessary. > > > > To track a dynamic number of IRQ-disabled regions and their associated > > saved states, a new resource type RES_TYPE_IRQ is introduced, which its > > state management functions: acquire_irq_state and release_irq_state, > > taking advantage of the refactoring and clean ups made in earlier > > commits. > > > > One notable requirement of the kernel's IRQ save and restore API is that > > they cannot happen out of order. For this purpose, when releasing reference > > we keep track of the prev_id we saw with REF_TYPE_IRQ. Since reference > > states are inserted in increasing order of the index, this is used to > > remember the ordering of acquisitions of IRQ saved states, so that we > > maintain a logical stack in acquisition order of resource identities, > > and can enforce LIFO ordering when restoring IRQ state. The top of the > > stack is maintained using bpf_verifier_state's active_irq_id. > > > > The logic to detect initialized and unitialized irq flag slots, marking > > and unmarking is similar to how it's done for iterators. No additional > > checks are needed in refsafe for REF_TYPE_IRQ, apart from the usual > > check_id satisfiability check on the ref[i].id. We have to perform the > > same check_ids check on state->active_irq_id as well. > > > > The kfuncs themselves are plain wrappers over local_irq_save and > > local_irq_restore macros. > > > > Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > Sorry, two more nits below. > > [...] > > > +static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +{ > > + struct bpf_func_state *state = func(env, reg); > > + struct bpf_stack_state *slot; > > + struct bpf_reg_state *st; > > + int spi, i, err; > > + > > + spi = irq_flag_get_spi(env, reg); > > + if (spi < 0) > > + return spi; > > + > > + slot = &state->stack[spi]; > > + st = &slot->spilled_ptr; > > + > > + err = release_irq_state(env->cur_state, st->ref_obj_id); > > + WARN_ON_ONCE(err && err != -EACCES); > > + if (err) { > > + verbose(env, "cannot restore irq state out of order\n"); > > Nit: maybe also print acquire_irq_id and an instruction where it was acquired? Ack. For printing the insn_idx, I guess just search in the refs array? > > > + return err; > > + } > > + > > + __mark_reg_not_init(env, st); > > + > > + /* see unmark_stack_slots_dynptr() for why we need to set REG_LIVE_WRITTEN */ > > + st->live |= REG_LIVE_WRITTEN; > > + > > + for (i = 0; i < BPF_REG_SIZE; i++) > > + slot->slot_type[i] = STACK_INVALID; > > + > > + mark_stack_slot_scratched(env, spi); > > + return 0; > > +} > > + > > +static bool is_irq_flag_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +{ > > + struct bpf_func_state *state = func(env, reg); > > + struct bpf_stack_state *slot; > > + int spi, i; > > + > > + /* For -ERANGE (i.e. spi not falling into allocated stack slots), we > > + * will do check_mem_access to check and update stack bounds later, so > > + * return true for that case. > > + */ > > + spi = irq_flag_get_spi(env, reg); > > + if (spi == -ERANGE) > > + return true; > > Nit: is it possible to swap is_irq_flag_reg_valid_uninit() and > check_mem_access(), so that ERANGE special case would be not needed? > I don't think so. For dynptr, iter, irq, ERANGE indicates stack needs to be grown, so check_mem_access will naturally do that when writing. When not ERANGE, we need to catch cases where we have a bad slot_type. If we overwrote it with check_mem_access, then it would scrub the slot type as well. When I fixed this stuff for dynptr, we had to additionally destroy_if_dynptr_stack_slot because it wasn't required to 'release' a dynptr when overwriting it. Andrii made sure this was necessary for iters so now slot_type == STACK_ITER is just rejected instead of overwrite without a destroy operation. Similar idea is followed for irq flag. Just paging in context for all this, but I may be missing if you have something in mind. > > + if (spi < 0) > > + return false; > > + > > + slot = &state->stack[spi]; > > + > > + for (i = 0; i < BPF_REG_SIZE; i++) > > + if (slot->slot_type[i] == STACK_IRQ_FLAG) > > + return false; > > + return true; > > +} > > [...] >