On Wed, 27 Nov 2024 at 16:33, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> Teach the verifier about IRQ-disabled sections through the introduction
> of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable
> them, and bpf_local_irq_restore, to restore IRQ state and enable them
> back again.
>
> For the purposes of tracking the saved IRQ state, the verifier is taught
> about a new special object on the stack of type STACK_IRQ_FLAG. This is
> an 8 byte value which saves the IRQ flags which are to be passed back to
> the IRQ restore kfunc.
>
> To track a dynamic number of IRQ-disabled regions and their associated
> saved states, a new resource type REF_TYPE_IRQ is introduced, with its
> state management functions: acquire_irq_state and release_irq_state,
> taking advantage of the refactoring and clean ups made in earlier
> commits.
>
> One notable requirement of the kernel's IRQ save and restore API is that
> they cannot happen out of order. For this purpose, when releasing a
> reference we keep track of the prev_id we saw with REF_TYPE_IRQ. Since
> reference states are inserted in increasing order of the index, this is
> used to remember the ordering of acquisitions of IRQ saved states, so
> that we maintain a logical stack in acquisition order of resource
> identities, and can enforce LIFO ordering when restoring IRQ state. The
> top of the stack is maintained using bpf_verifier_state's active_irq_id.
>
> The logic to detect initialized and uninitialized irq flag slots, and
> their marking and unmarking, is similar to how it's done for iterators.
> No additional checks are needed in refsafe for REF_TYPE_IRQ, apart from
> the usual check_id satisfiability check on the ref[i].id. We have to
> perform the same check_ids check on state->active_irq_id as well.
>
> The kfuncs themselves are plain wrappers over local_irq_save and
> local_irq_restore macros.
>
> Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
> ---
>  include/linux/bpf_verifier.h |   9 +-
>  kernel/bpf/helpers.c         |  17 +++
>  kernel/bpf/log.c             |   1 +
>  kernel/bpf/verifier.c        | 279 ++++++++++++++++++++++++++++++++++-
>  4 files changed, 303 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index af64b5415df8..81eebe449e6c 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -233,6 +233,7 @@ enum bpf_stack_slot_type {
>          */
>         STACK_DYNPTR,
>         STACK_ITER,
> +       STACK_IRQ_FLAG,
>  };
>
>  #define BPF_REG_SIZE 8 /* size of eBPF register in bytes */
> @@ -254,8 +255,11 @@ struct bpf_reference_state {
>   * default to pointer reference on zero initialization of a state.
>   */
>  enum ref_state_type {
> -       REF_TYPE_PTR = 0,
> -       REF_TYPE_LOCK,
> +       REF_TYPE_PTR       = 0,
> +       REF_TYPE_IRQ       = (1 << 0),
> +
> +       REF_TYPE_LOCK      = (1 << 1),
> +       REF_TYPE_LOCK_MASK = REF_TYPE_LOCK,

I'm thinking of reconsidering the bit above and below. When rebasing
other patches on top of this series, I sort of realized it might be
unnecessary to keep the mask, and just make REF_TYPE_PTR non-zero, and
always do s->type & type (which is needed by later patches for spin
locks). A sketch of what I mean is at the end of this mail.

> -       if (s->type == REF_TYPE_PTR || s->type != type)
> +       if (!(s->type & REF_TYPE_LOCK_MASK) || s->type != type)
>                 continue;
>
>         if (s->id == id && s->ptr == ptr)
> @@ -3236,6 +3395,16 @@ static int mark_iter_read(struct bpf_verifier_env *env, struct bpf_reg_state *re
>         return mark_stack_slot_obj_read(env, reg, spi, nr_slots);
>  }
>
> [...]
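
Concretely, the rework I'm thinking of is something like the below
(untested sketch, exact names and bit values may change in the next
revision):

  enum ref_state_type {
          REF_TYPE_PTR  = (1 << 0),
          REF_TYPE_IRQ  = (1 << 1),
          REF_TYPE_LOCK = (1 << 2),
  };

so that find_lock_state() can filter with a plain mask test instead of
special-casing REF_TYPE_PTR:

          if (!(s->type & type))
                  continue;

With REF_TYPE_PTR made non-zero, REF_TYPE_LOCK_MASK becomes
unnecessary, and the same s->type & type test carries over to the later
spin lock patches.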
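
As an aside, since the kernel/bpf/helpers.c hunk is trimmed from the
quote above: the kfuncs are thin wrappers, roughly of the shape below
(illustrative sketch only, see the patch itself for the exact
definitions and kfunc annotations):

  __bpf_kfunc void bpf_local_irq_save(unsigned long *flags__irq_flag)
  {
          /* Save the current IRQ state into the caller's stack slot
           * and disable local interrupts.
           */
          local_irq_save(*flags__irq_flag);
  }

  __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
  {
          /* Restore interrupts to the state saved earlier. */
          local_irq_restore(*flags__irq_flag);
  }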
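
And for completeness, the intended usage from a BPF program looks like
this (illustrative snippet, not lifted verbatim from the selftests in
this series):

  unsigned long flags;

  bpf_local_irq_save(&flags);
  /* IRQ-disabled critical section; 'flags' occupies an 8 byte stack
   * slot that the verifier tracks as STACK_IRQ_FLAG.
   */
  bpf_local_irq_restore(&flags);

Saves may nest, but restores must happen in the reverse order of the
saves; that LIFO ordering is what the active_irq_id / prev_id
bookkeeping described in the commit log enforces.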