On Wed, Apr 05, 2023 at 02:42:33AM +0200, Kumar Kartikeya Dwivedi wrote: > > - The exception state is represented using four booleans in the > task_struct of current task. Each boolean corresponds to the exception > state for each kernel context. This allows BPF programs to be > interrupted and still not clobber the other's exception state. that doesn't work for sleepable bpf progs and in RT for regular progs too. > - The other vexing case is of recursion. If a program calls into another > program (e.g. call into helper which invokes tracing program > eventually), it may throw and clobber the current exception state. To > avoid this, an invariant is maintained across the implementation: > Exception state is always cleared on entry and exit of the main > BPF program. > This implies that if recursion occurs, the BPF program will clear the > current exception state on entry and exit. However, callbacks do not > do the same, because they are subprograms. The case for propagating > exceptions of callbacks invoked by the kernel back to the BPF program > is handled in the next commit. This is also the main reason to clear > exception state on entry, asynchronous callbacks can clobber exception > state even though we make sure it's always set to be 0 within the > kernel. > Anyhow, the only other thing to be kept in mind is to never allow a > BPF program to execute when the program is being unwinded. This > implies that every function involved in this path must be notrace, > which is the case for bpf_throw, bpf_get_exception and > bpf_reset_exception. ... > + struct bpf_insn entry_insns[] = { > + BPF_MOV64_REG(BPF_REG_6, BPF_REG_1), > + BPF_EMIT_CALL(bpf_reset_exception), > + BPF_MOV64_REG(BPF_REG_1, BPF_REG_6), > + insn[i], > + }; Is not correct in global bpf progs that take more than 1 argument. How about using a scratch space in prog->aux->exception[] instead of current task? > +notrace u64 bpf_get_exception(void) > +{ > + int i = interrupt_context_level(); > + > + return current->bpf_exception_thrown[i]; > +} this is too slow to be acceptable. it needs to be single load plus branch. with prog->aux->exception approach we can achieve that. Instead of inserting a call to bpf_get_exception() we can do load+cmp. We probably should pass prog->aux into exception callback, so it can know where throw came from. > - Rewrites happen for bpf_throw and call instructions to subprogs. > The instructions which are executed in the main frame of the main > program (thus, not global functions and extension programs, which end > up executing in frame > 0) need to be rewritten differently. This is > tracked using BPF_THROW_OUTER vs BPF_THROW_INNER. If not done, a how about BPF_THROW_OUTER vs BPF_THROW_ANY_INNER ? would it be more precise ? > +__bpf_kfunc notrace void bpf_throw(void) > +{ > + int i = interrupt_context_level(); > + > + current->bpf_exception_thrown[i] = true; > +} I think this needs to take u64 or couple u64 args and store them in the scratch area. bpf_assert* macros also need a way for bpf prog to specify the reason for the assertion. Otherwise there won't be any way to debug what happened.