During testing, it was discovered that extensions to exception callbacks had no checks, upon running a testcase, the kernel ended up running off the end of a program having final call as bpf_throw, and hitting int3 instructions. The reason is that while the default exception callback would have reset the stack frame to return back to the main program's caller, the replacing extension program will simply return back to bpf_throw, which will instead return back to the program and the program will continue execution, now in an undefined state where anything could happen. The way to support extensions to an exception callback would be to mark the BPF_PROG_TYPE_EXT main subprog as an exception_cb, and prevent it from calling bpf_throw. This would make the JIT produce a prologue that restores saved registers and reset the stack frame. But let's not do that until there is a concrete use case for this, and simply disallow this for now. Similar issues will exist for fentry and fexit cases, where trampoline saves data on the stack when invoking exception callback, which however will then end up resetting the stack frame, and on return, the fexit program will never will invoked as the return address points to the main program's caller in the kernel. Instead of additional complexity and back and forth between the two stacks to enable such a use case, simply forbid it. One key point here to note is that currently X86_TAIL_CALL_OFFSET didn't require any modifications, even though we emit instructions before the corresponding endbr64 instruction. This is because we ensure that a main subprog never serves as an exception callback, and therefore the exception callback (which will be a global subprog) can never serve as the tail call target, eliminating any discrepancies. However, once we support a BPF_PROG_TYPE_EXT to also act as an exception callback, it will end up requiring change to the tail call offset to account for the extra instructions. For simplicitly, tail calls could be disabled for such targets. Noting the above, it appears better to wait for a concrete use case before choosing to permit extension programs to replace exception callbacks. As a precaution, we disable fentry and fexit for exception callbacks as well. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> --- kernel/bpf/helpers.c | 1 + kernel/bpf/verifier.c | 11 +++++++++++ 2 files changed, 12 insertions(+) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 2c8e1ee97b71..7ff2a42f1996 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2490,6 +2490,7 @@ __bpf_kfunc void bpf_throw(u64 cookie) */ kasan_unpoison_task_stack_below((void *)ctx.sp); ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp); + WARN(1, "A call to BPF exception callback should never return\n"); } __diag_pop(); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 0ba32b626320..21e37e46d792 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -19750,6 +19750,12 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, bpf_log(log, "Subprog %s doesn't exist\n", tname); return -EINVAL; } + if (aux->func && aux->func[subprog]->aux->exception_cb) { + bpf_log(log, + "%s programs cannot attach to exception callback\n", + prog_extension ? "Extension" : "FENTRY/FEXIT"); + return -EINVAL; + } conservative = aux->func_info_aux[subprog].unreliable; if (prog_extension) { if (conservative) { @@ -19762,6 +19768,11 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, "Extension programs should be JITed\n"); return -EINVAL; } + if (aux->func && aux->func[subprog]->aux->exception_cb) { + bpf_log(log, + "Extension programs cannot replace exception callback\n"); + return -EINVAL; + } } if (!tgt_prog->jited) { bpf_log(log, "Can attach to only JITed progs\n"); -- 2.41.0