On Wed, 2024-11-27 at 08:58 -0800, Kumar Kartikeya Dwivedi wrote:
> Teach the verifier about IRQ-disabled sections through the introduction
> of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable
> them, and bpf_local_irq_restore, to restore IRQ state and enable them
> back again.
>
> For the purposes of tracking the saved IRQ state, the verifier is taught
> about a new special object on the stack of type STACK_IRQ_FLAG. This is
> an 8-byte value which saves the IRQ flags which are to be passed back to
> the IRQ restore kfunc.
>
> Renumber the enums for REF_TYPE_* to simplify the check in
> find_lock_state; filtering out non-lock types as they grow will become
> cumbersome and is unnecessary.
>
> To track a dynamic number of IRQ-disabled regions and their associated
> saved states, a new resource type RES_TYPE_IRQ is introduced, with its
> state management functions, acquire_irq_state and release_irq_state,
> taking advantage of the refactoring and clean-ups made in earlier
> commits.
>
> One notable requirement of the kernel's IRQ save and restore API is that
> they cannot happen out of order. For this purpose, when releasing a
> reference we keep track of the prev_id we saw with REF_TYPE_IRQ. Since
> reference states are inserted in increasing order of the index, this is
> used to remember the ordering of acquisitions of IRQ saved states, so
> that we maintain a logical stack in acquisition order of resource
> identities, and can enforce LIFO ordering when restoring IRQ state. The
> top of the stack is maintained using bpf_verifier_state's active_irq_id.
>
> The logic to detect initialized and uninitialized irq flag slots, and to
> mark and unmark them, is similar to how it's done for iterators. No
> additional checks are needed in refsafe for REF_TYPE_IRQ, apart from the
> usual check_id satisfiability check on the ref[i].id. We have to perform
> the same check_ids check on state->active_irq_id as well.
>
> The kfuncs themselves are plain wrappers over the local_irq_save and
> local_irq_restore macros.
>
> Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>

Sorry, two more nits below.

[...]

> +static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +{
> +	struct bpf_func_state *state = func(env, reg);
> +	struct bpf_stack_state *slot;
> +	struct bpf_reg_state *st;
> +	int spi, i, err;
> +
> +	spi = irq_flag_get_spi(env, reg);
> +	if (spi < 0)
> +		return spi;
> +
> +	slot = &state->stack[spi];
> +	st = &slot->spilled_ptr;
> +
> +	err = release_irq_state(env->cur_state, st->ref_obj_id);
> +	WARN_ON_ONCE(err && err != -EACCES);
> +	if (err) {
> +		verbose(env, "cannot restore irq state out of order\n");

Nit: maybe also print the acquire_irq_id and the instruction where it was
acquired?

> +		return err;
> +	}
> +
> +	__mark_reg_not_init(env, st);
> +
> +	/* see unmark_stack_slots_dynptr() for why we need to set REG_LIVE_WRITTEN */
> +	st->live |= REG_LIVE_WRITTEN;
> +
> +	for (i = 0; i < BPF_REG_SIZE; i++)
> +		slot->slot_type[i] = STACK_INVALID;
> +
> +	mark_stack_slot_scratched(env, spi);
> +	return 0;
> +}
> +
> +static bool is_irq_flag_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +{
> +	struct bpf_func_state *state = func(env, reg);
> +	struct bpf_stack_state *slot;
> +	int spi, i;
> +
> +	/* For -ERANGE (i.e. spi not falling into allocated stack slots), we
> +	 * will do check_mem_access to check and update stack bounds later, so
> +	 * return true for that case.
> +	 */
> +	spi = irq_flag_get_spi(env, reg);
> +	if (spi == -ERANGE)
> +		return true;

Nit: is it possible to swap is_irq_flag_reg_valid_uninit() and
check_mem_access(), so that the -ERANGE special case would not be needed?
> +	if (spi < 0)
> +		return false;
> +
> +	slot = &state->stack[spi];
> +
> +	for (i = 0; i < BPF_REG_SIZE; i++)
> +		if (slot->slot_type[i] == STACK_IRQ_FLAG)
> +			return false;
> +	return true;
> +}

[...]