This series implements the bare minimum support for basic BPF
exceptions. This is a feature to allow programs to simply throw a
valueless exception within a BPF program to abort its execution.
Automatic cleanup of held resources and generation of landing pads to
unwind program state will be done in the part 2 set.

The exception state is part of the current task's task_struct's
exception_thrown array, which is a set of 4 boolean variables,
representing the exception state for each kernel context (Task, SoftIRQ,
HardIRQ, NMI). This allows a program interrupting the execution of
another program to not corrupt its exception state when throwing.

During program nesting, i.e. a program invoked in the same context while
another program is active, an exception can never be in the 'thrown'
state: as long as the nested program clears its own exception state when
exiting and returning to the kernel, the caller program will be fine.

Hence, the conditions are that a program receives unset exception state
on entry, and preserves the unset exception state on exit. Only BPF
subprogs avoid resetting exception state on exit, so that they can
propagate the modified exception state to their caller. This can be
another caller subprog, the caller main prog, or the kernel.

BPF helpers executing callbacks in the kernel are modified to catch and
exit early when they detect a thrown exception from the callback. The
program will then catch this exception and unwind.

BPF helpers and kfuncs may also throw BPF exceptions; they simply
declare their intent to throw using annotations (e.g. a predicate
function with a list of throwing helpers, KF_THROW for kfuncs, etc.).
The program will automatically be instrumented to detect a thrown
exception and unwind the stack.

The following two kfuncs are introduced:

    // Throw a BPF exception, terminating execution of the program.
    void bpf_throw(void);

    // Set an exception handler which is invoked after the unwinding of the
    // stack is finished. The return value of the handler is the value
    // returned when an exception is thrown, otherwise by default it is 0.
    void bpf_set_exception_handler(int (*handler)(void));

Dedicated exception state in task_struct vs bpf_run_ctx was chosen
primarily for the following reasons:
 - Synchronous callbacks semantically execute within programs and affect
   their state. Hence, exceptions thrown by them should be propagated
   when invoked from helpers and kfuncs.
 - For async callbacks that terminate execution safely using bpf_throw,
   the bpf_run_ctx approach would require having a bpf_run_ctx by
   default (with the same semantics as the current solution) or one set
   up by the invoking context for each cb invocation (which breaks the
   cb == func pointer semantics currently assumed within the kernel).
 - Avoid setting up bpf_run_ctx for program types (esp. XDP) that don't
   need it, and no changes are needed for programs that don't use
   exceptions. Whatever minor overhead there is, is only paid when
   exceptions are used.

Restrictions can be imposed on the program to revisit the bpf_run_ctx
solution (like forbidding callbacks from using bpf_throw). However,
generality of use of bpf_throw in all possible cases is given preference
(so that bpf_assert, which is the primary consumer, isn't surprising to
use because it is unsupported in certain contexts).

The overhead of calling helpers to deal with exception state can be
easily avoided by getting direct access to the current task pointer to
touch exception state, or by using a dedicated callee-saved register to
hold it within the program (while relying on task_struct state across
the BPF-Kernel boundary).

Verification
------------

We rely on the main verifier pass to decide how to rewrite the BPF
program to handle thrown exceptions. The first step of verification in
the main pass (do_check) is symbolic execution of the global
subprograms. These subprograms are verified independently, and hence,
they are not explored whenever there is a call to a global subprog.

We first do 'can_throw' marking for each global subprogram by following
its execution, detecting any bpf_throw calls made in any of the paths
explored by the verifier. This however does not have full visibility
into the thrown exceptions, and only attains markings for exceptions it
can see to be thrown by static subprogs, helpers, kfuncs, etc.

For instance:

    GF1:
        call GF2
        exit
    GF2:
        call GF3
        exit
    GF3:
        call bpf_throw

If all of these are explored in order, only GF3 receives the can_throw
marking. To remedy this, we do another pass and follow BPF_PSEUDO_CALL
edges to global subprogs in the call graph of global subprogs, and
propagate the direct throws in some global subprogs to other global
subprogs. Each caller then receives the can_throw marking and is marked
for rewrite later.

Now, all global functions are annotated correctly with the right
can_throw markings, and any calls to them in the main subprog can also
be marked for appropriate rewrites later.

We now go through the main subprog. Since the verifier's path awareness
gives us full visibility into which paths may throw, we use this to
selectively mark every instruction which has throwing semantics (calls
to throwing subprogs (global/static), calls to helpers or kfuncs taking
callbacks which throw, calls to kfuncs which throw, etc.).

If a certain static subprog only throws when called from a certain point
in the program, and does not throw when called from another, we avoid
marking its other callsite as throwing. This is unlike global subprogs,
which we do not explore at each callsite and for which we hence cannot
make this distinction.

This also means that two calls to the same static subprog may be
rewritten differently, and thus may or may not handle the exception.
This becomes a problem if we allow extension programs to attach to such
static subprogs. Usually, the use case is to replace global subprogs,
and attaching to static subprogs is rejected right now.
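
The propagation of can_throw markings across global subprogs described
above can be sketched as a fixpoint over the call graph. This is only a
minimal userspace illustration under stated assumptions: struct subprog,
propagate_can_throw() and init_example() are hypothetical names invented
for this sketch and are not the verifier's actual data structures or
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUBPROGS 8 /* hypothetical limit, for this sketch only */

/* Hypothetical model of a global subprog (not the verifier's struct). */
struct subprog {
	bool can_throw;            /* a throw was seen while verifying it */
	int callees[MAX_SUBPROGS]; /* indices of global subprogs it calls */
	size_t nr_callees;
};

/*
 * Propagate can_throw from callees to callers until nothing changes.
 * Returns the number of subprogs that were newly marked.
 */
static int propagate_can_throw(struct subprog *sp, size_t n)
{
	int newly_marked = 0;
	bool changed = true;

	assert(n <= MAX_SUBPROGS);
	while (changed) {
		changed = false;
		for (size_t i = 0; i < n; i++) {
			if (sp[i].can_throw)
				continue;
			for (size_t j = 0; j < sp[i].nr_callees; j++) {
				if (sp[sp[i].callees[j]].can_throw) {
					sp[i].can_throw = true;
					changed = true;
					newly_marked++;
					break;
				}
			}
		}
	}
	return newly_marked;
}

/* Build the GF1 -> GF2 -> GF3 chain; only GF3 throws directly. */
static void init_example(struct subprog sp[3])
{
	sp[0] = (struct subprog){ .callees = { 1 }, .nr_callees = 1 };
	sp[1] = (struct subprog){ .callees = { 2 }, .nr_callees = 1 };
	sp[2] = (struct subprog){ .can_throw = true };
}
```

Calling propagate_can_throw() on the array filled in by init_example()
marks GF2 on the first sweep and GF1 on the second, after which all
three entries carry the marking.
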
A test case is included to catch the case when things change and prompt
appropriate checks in check_ext_prog (as not all callsites are prepared
to handle exceptions).

Optimizations
-------------

Currently, exception handling code is generated inline. This was done to
keep things simple for now, and since the generated code is typically
constant for each type of call instruction. Generating dedicated landing
pads to unwind the program stack, and moving them out of line to a
separate 'invented' subprog or to the end of the subprog, has been
deferred to the future set, where the BPF runtime needs to release
resources present on the program stack.

Another minor annoyance is the calls to bpf_get_exception to fetch
exception state. This call needs to be made unconditionally regardless
of whether an exception was thrown or not. It would be much more
convenient to have a hidden callee-saved BPF register which holds the
exception state, but I'm not sure whether using R12 for that (e.g. on
x86) hurts some other use case.

Then, checking exceptions thrown within the program becomes much more
lightweight, and the use of bpf_get_exception is limited to exceptions
thrown by helpers and kfuncs (around their call, and then the exception
register can propagate it to callers). Callbacks invoked by the kernel
synchronously will then set both the exception register and the
task-local exception state.

Known issues
------------

 * Since bpf_throw is marked noreturn, the compiler may sometimes
   determine that a function always throws and emit the final
   instruction as a call to it without emitting an exit in the caller.
   This leads to an error where the verifier complains about the final
   instruction not being a jump, exit, or bpf_throw call (which gets
   treated as an exit). This is unlikely to occur, as bpf_throw wouldn't
   be used whenever the condition is already known at compile time, but
   I could see it when testing with always-throwing subprogs and calling
   into them.
 * Just asm volatile ("call bpf_throw" :::) does not emit DATASEC .ksyms
   for bpf_throw, leading to errors during compilation. There needs to
   be an explicit call in C for clang to emit the DATASEC info in BTF.

Kumar Kartikeya Dwivedi (9):
  bpf: Fix kfunc callback handling
  bpf: Refactor and generalize optimize_bpf_loop
  bpf: Implement bpf_throw kfunc
  bpf: Handle throwing BPF callbacks in helpers and kfuncs
  bpf: Add pass to fixup global function throw information
  bpf: Add KF_THROW annotation for kfuncs
  bpf: Introduce bpf_set_exception_callback kfunc
  bpf: Introduce BPF assertion macros
  selftests/bpf: Add tests for BPF exceptions

 include/linux/bpf.h                          |   9 +-
 include/linux/bpf_verifier.h                 |  20 +-
 include/linux/btf.h                          |   1 +
 include/linux/sched.h                        |   1 +
 kernel/bpf/arraymap.c                        |  13 +-
 kernel/bpf/bpf_iter.c                        |   2 +
 kernel/bpf/hashtab.c                         |   4 +-
 kernel/bpf/helpers.c                         |  46 +-
 kernel/bpf/ringbuf.c                         |   4 +
 kernel/bpf/syscall.c                         |  10 +
 kernel/bpf/task_iter.c                       |   2 +
 kernel/bpf/trampoline.c                      |   4 +-
 kernel/bpf/verifier.c                        | 692 +++++++++++++++++-
 net/bpf/test_run.c                           |  12 +
 .../testing/selftests/bpf/bpf_experimental.h |  37 +
 .../selftests/bpf/prog_tests/exceptions.c    | 240 ++++++
 .../testing/selftests/bpf/progs/exceptions.c | 218 ++++++
 .../selftests/bpf/progs/exceptions_ext.c     |  42 ++
 .../selftests/bpf/progs/exceptions_fail.c    | 267 +++++++
 19 files changed, 1576 insertions(+), 48 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/exceptions.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions_ext.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions_fail.c

base-commit: d099f594ad5650e8d8232b5f31f5f90104e65def
--
2.40.0