[PATCH RFC bpf-next v1 0/9] Exceptions - 1/2

Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> · Wed, 5 Apr 2023 02:42:30 +0200

This series implements the bare minimum support for basic BPF
exceptions. This is a feature to allow programs to simply throw a
valueless exception within a BPF program to abort its execution.
Automatic cleanup of held resources and generation of landing pads to
unwind program state will be done in the part 2 set.

The exception state lives in the exception_thrown array of the current
task's task_struct. The array holds 4 booleans, one per kernel context
(Task, SoftIRQ, HardIRQ, NMI), so a program that interrupts the
execution of another program cannot corrupt the interrupted program's
exception state when throwing.
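
As a rough userspace model of the per-context state described above (not
the code from this series), the sketch below keeps one flag per context
and only ever touches the flag of the context it is given. The enum,
struct, and function names are hypothetical; only the exception_thrown
array and its 4 contexts come from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: one slot per kernel context named in the text. */
enum ctx_level { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_MAX };

/* Stand-in for the part of task_struct that holds exception state. */
struct task_model {
	bool exception_thrown[CTX_MAX];
};

/* Mark an exception as thrown in exactly one context. */
void model_throw(struct task_model *t, enum ctx_level lvl)
{
	assert(lvl < CTX_MAX);
	t->exception_thrown[lvl] = true;
}

/* Read and clear the state of one context, leaving the others alone. */
bool model_fetch_and_clear(struct task_model *t, enum ctx_level lvl)
{
	bool thrown;

	assert(lvl < CTX_MAX);
	thrown = t->exception_thrown[lvl];
	t->exception_thrown[lvl] = false;
	return thrown;
}
```

A throw recorded for CTX_SOFTIRQ leaves the CTX_TASK slot untouched,
which is the property that protects an interrupted program.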

During program nesting, i.e. when a program is invoked in the same
context while another program is active, the exception state can never
be 'thrown' when the nested program starts. As long as the nested
program clears its own exception state when exiting and returning to
the kernel, the caller program will be fine.

Hence, the conditions are that a program receives unset exception state
on entry, and preserves the unset exception state on exit. Only BPF
subprogs avoid resetting exception state on exit, so that they can
propagate the modified exception state to their caller. This can be
another caller subprog, the caller main prog, or the kernel.

BPF helpers executing callbacks in the kernel are modified to catch and
exit early when they detect a thrown exception from the callback. The
program will then catch this exception and unwind.
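
The helper-side behaviour can be sketched in plain userspace C, under
the assumption that a helper checks the exception state after every
callback invocation. All names below (model_loop, throwing_cb,
model_prog, the global flag) are hypothetical stand-ins and not the
helpers touched by this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the task-local exception state of the current context. */
bool exception_thrown;

typedef long (*loop_cb_t)(int idx, void *ctx);

/* Hypothetical loop helper: returns the number of callback invocations. */
int model_loop(int nr_loops, loop_cb_t cb, void *ctx)
{
	int i;

	assert(cb);
	for (i = 0; i < nr_loops; i++) {
		long ret = cb(i, ctx);

		/* Exit early on a thrown exception, leaving the state set. */
		if (exception_thrown || ret)
			return i + 1;
	}
	return i;
}

/* Callback that counts its invocations and "throws" on the third one. */
long throwing_cb(int idx, void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	if (idx == 2)
		exception_thrown = true; /* models bpf_throw() */
	return 0;
}

/* Main prog: sees the exception after the helper call, clears it on exit. */
bool model_prog(int *calls)
{
	bool thrown;

	model_loop(10, throwing_cb, calls);
	thrown = exception_thrown;
	exception_thrown = false;
	return thrown;
}
```

Here the helper stops after the third callback invocation, and the main
program hands unset exception state back to its caller.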

BPF helpers and kfuncs may also throw BPF exceptions; they simply
declare their intent to throw using annotations (e.g. predicate function
with list of throwing helpers, KF_THROW for kfuncs, etc). The program
will automatically be instrumented to detect a thrown exception and
unwind the stack.

The following two kfuncs are introduced:

// Throw a BPF exception, terminating execution of the program.
void bpf_throw(void);

// Set an exception handler which is invoked after the unwinding of the
// stack is finished. The return value of the handler is the value
// returned when an exception is thrown, otherwise by default it is 0.
void bpf_set_exception_handler(int (*handler)(void));
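
The return value rule in the comment above can be modelled in userspace
C as follows. This is only a sketch of the described semantics, assuming
the handler runs once after unwinding; struct prog_model, model_run, and
the body functions are hypothetical and do not appear in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*exc_handler_t)(void);

/* Hypothetical per-invocation state: handler (if any) and thrown flag. */
struct prog_model {
	exc_handler_t handler;
	bool thrown;
};

int handler_einval(void)
{
	return -22;
}

/* Program body that never throws: its own return value is used. */
int body_no_throw(struct prog_model *p)
{
	(void)p;
	return 7;
}

/* Throws without setting a handler: the default return value applies. */
int body_throw_default(struct prog_model *p)
{
	p->thrown = true; /* models bpf_throw() */
	return 7;         /* ignored once an exception is thrown */
}

/* Sets a handler first, then throws: the handler decides the result. */
int body_throw_with_handler(struct prog_model *p)
{
	p->handler = handler_einval; /* models bpf_set_exception_handler() */
	p->thrown = true;
	return 7;
}

/* Run one program invocation with unset exception state on entry/exit. */
int model_run(struct prog_model *p, int (*body)(struct prog_model *))
{
	int ret;

	p->handler = NULL;
	p->thrown = false;
	ret = body(p);
	if (!p->thrown)
		return ret;
	p->thrown = false;
	return p->handler ? p->handler() : 0;
}
```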

Dedicated exception state in task_struct vs bpf_run_ctx was chosen
primarily for the following reasons:
 - Synchronous callbacks semantically execute within programs and affect
   their state. Hence, exceptions thrown by them should be propagated
   when invoked from helpers and kfuncs.
 - For async callbacks that terminate execution safely using bpf_throw,
   this would require having a bpf_run_ctx by default (with same
   semantics as the current solution) or setup by the invoking context
   for each cb invocation (which breaks cb == func pointer semantics
   assumed currently within the kernel).
 - Avoid setting up bpf_run_ctx for program types (esp. XDP) that don't
   need it, and no changes are needed for programs that don't use
   exceptions. Whatever minor overhead there is, is only paid for when
   they are used.

Restrictions can be imposed on the program to revisit the bpf_run_ctx
solution (like forbidding callbacks from using bpf_throw). However,
genericity of use of bpf_throw in all possible cases is given preference
(so that bpf_assert which is the primary consumer isn't surprising to
use when it isn't supported in certain contexts).

The overhead of calling helpers to deal with exception state can be
easily forgone by getting direct access to the current task pointer to
touch exception state, or using a dedicated callee-saved register to
hold it within the program (while relying on task_struct state across
BPF-Kernel boundary).

Verification
------------

We rely on the main verifier pass to decide how to rewrite the BPF
program to handle thrown exceptions.

The first step of verification in the main pass (do_check) is symbolic
execution of the global subprograms. These subprograms are verified
independently, and hence, they are not explored whenever there is a call
to a global subprog.

We first do 'can_throw' marking for each global subprogram by following
its execution, detecting any bpf_throw calls made in any of the paths
explored by the verifier. This however does not have full visibility
into the thrown exceptions, and only attains markings for exceptions it
can see to be thrown by static subprogs, helpers, kfuncs, etc.

For instance:

GF1:
	call GF2
	exit
GF2:
	call GF3
	exit
GF3:
	call bpf_throw

If all of these are explored in order, only GF3 receives the can_throw
marking. To remedy this, we do another pass and follow BPF_PSEUDO_CALL
edges to global subprogs in the call graph of global subprogs, and
propagate the direct throws in some global subprogs to other global
subprogs. Each caller then receives can_throw marking and is marked for
rewrite later.
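
The propagation step can be illustrated with a small fixed-point
iteration over a call graph. This is a sketch of the idea only, assuming
an adjacency matrix of call edges; struct callgraph and
propagate_can_throw are hypothetical names, and the pass in this series
may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SUBPROGS 8

/* Hypothetical call graph of global subprogs. */
struct callgraph {
	int nr;                                  /* number of subprogs in use */
	bool calls[MAX_SUBPROGS][MAX_SUBPROGS];  /* calls[a][b]: a calls b */
	bool can_throw[MAX_SUBPROGS];            /* seeded with direct throws */
};

/* Propagate can_throw from callees to callers until nothing changes. */
void propagate_can_throw(struct callgraph *g)
{
	bool changed;

	assert(g->nr >= 0 && g->nr <= MAX_SUBPROGS);
	do {
		changed = false;
		for (int caller = 0; caller < g->nr; caller++) {
			if (g->can_throw[caller])
				continue;
			for (int callee = 0; callee < g->nr; callee++) {
				if (g->calls[caller][callee] &&
				    g->can_throw[callee]) {
					g->can_throw[caller] = true;
					changed = true;
					break;
				}
			}
		}
	} while (changed);
}
```

With the GF1 -> GF2 -> GF3 chain from the example and only GF3 seeded as
throwing, the loop marks GF2 in the first round and GF1 in the second,
while a subprog that calls neither stays unmarked.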

Now, all global functions are annotated correctly with the right
can_throw markings, and any calls to them in the main subprog can also
be marked for appropriate rewrites later. We now go through the main
subprog. Since the verifier's path awareness gives us full visibility
into which paths may throw, we use it to selectively mark every
instruction which has throwing semantics: calls to throwing subprogs
(global/static), calls to helpers or kfuncs taking callbacks which
throw, calls to kfuncs which throw, etc.

If a certain static subprog only throws when called from one point in
the program and does not throw when called from another, we avoid
marking the non-throwing callsite as throwing. Global subprogs differ:
we do not explore them per callsite, hence cannot make this distinction.
This also means that two calls to the same static subprog may be
rewritten differently, and thus may or may not handle the exception.

This becomes a problem if we allow extension programs to attach to such
static subprogs. Usually, the use case is to replace global subprogs,
and static subprogs are rejected right now. A test case is included to
catch the case when things change and prompt appropriate checks in
check_ext_prog (as not all callsites are prepared to handle exceptions).

Optimizations
-------------

Currently, exception handling code is generated inline. This was done to
keep things simple for now, and since the generated code is typically
constant for each type of call instruction.

Generation of dedicated landing pads to unwind the program stack, and
moving them out of line into a separate 'invented' subprog or to the end
of the subprog, has been split into the future set, where the BPF
runtime needs to release resources present on the program stack.

Another minor annoyance is the calls to bpf_get_exception to fetch
exception state. This call needs to be made unconditionally regardless
of whether an exception was thrown or not. It would be much more
convenient to have a hidden callee-saved BPF register which holds the
exception state, but I'm not sure whether using R12 for that (e.g. on
x86) hurts some other use case.

Then, checking exceptions thrown within the program becomes much more
lightweight, and the use of bpf_get_exception is limited to exceptions
thrown by helpers and kfuncs (around their call, and then the exception
register can propagate it to callers).

Callbacks invoked by the kernel synchronously will then set both
exception register and task-local exception state.

Known issues
------------

* Since bpf_throw is marked noreturn, the compiler sometimes may determine
  that a function always throws and emit the final instruction as a call
  to it without emitting an exit in the caller. This leads to an error
  where the verifier complains about final instruction not being a jump,
  exit, or bpf_throw call (which gets treated as an exit). This is
  unlikely to occur as bpf_throw wouldn't be used whenever the condition
  is already known at compile time, but I could see it when testing with
  always throwing subprogs and calling into them.

* Just asm volatile ("call bpf_throw" :::) does not emit DATASEC .ksyms
  for bpf_throw. There needs to be an explicit call in C for clang to
  emit the DATASEC info in BTF; without it, compilation fails.

Kumar Kartikeya Dwivedi (9):
  bpf: Fix kfunc callback handling
  bpf: Refactor and generalize optimize_bpf_loop
  bpf: Implement bpf_throw kfunc
  bpf: Handle throwing BPF callbacks in helpers and kfuncs
  bpf: Add pass to fixup global function throw information
  bpf: Add KF_THROW annotation for kfuncs
  bpf: Introduce bpf_set_exception_callback kfunc
  bpf: Introduce BPF assertion macros
  selftests/bpf: Add tests for BPF exceptions

 include/linux/bpf.h                           |   9 +-
 include/linux/bpf_verifier.h                  |  20 +-
 include/linux/btf.h                           |   1 +
 include/linux/sched.h                         |   1 +
 kernel/bpf/arraymap.c                         |  13 +-
 kernel/bpf/bpf_iter.c                         |   2 +
 kernel/bpf/hashtab.c                          |   4 +-
 kernel/bpf/helpers.c                          |  46 +-
 kernel/bpf/ringbuf.c                          |   4 +
 kernel/bpf/syscall.c                          |  10 +
 kernel/bpf/task_iter.c                        |   2 +
 kernel/bpf/trampoline.c                       |   4 +-
 kernel/bpf/verifier.c                         | 692 +++++++++++++++++-
 net/bpf/test_run.c                            |  12 +
 .../testing/selftests/bpf/bpf_experimental.h  |  37 +
 .../selftests/bpf/prog_tests/exceptions.c     | 240 ++++++
 .../testing/selftests/bpf/progs/exceptions.c  | 218 ++++++
 .../selftests/bpf/progs/exceptions_ext.c      |  42 ++
 .../selftests/bpf/progs/exceptions_fail.c     | 267 +++++++
 19 files changed, 1576 insertions(+), 48 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/exceptions.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions_ext.c
 create mode 100644 tools/testing/selftests/bpf/progs/exceptions_fail.c

base-commit: d099f594ad5650e8d8232b5f31f5f90104e65def
-- 
2.40.0