On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote: > bpf_tail_call(), LD_ABS, and LD_IND can cause the current function to > return abnormally: > - On success, bpf_tail_call() will jump to the tail_called program, and > that program will return directly to the outer caller. > - On failure, LD_ABS or LD_IND return directly to the outer caller. > > But the verifier doesn't account for these abnormal exits, so it assumes > the instructions following a bpf_tail_call() or LD_ABS are always > executed, and updates bounds info accordingly. > > Before BPF to BPF calls that was ok: the whole BPF program would > terminate anyways, so it didn't matter that the verifier state didn't > match reality. > > But if these instructions are used in a function call, the verifier will > propagate some of this incorrect bounds info to the caller. There are at > least two kinds of this: > - The callee's return value in the caller. > - References to the caller's stack passed into the caller. > > For example, loading: > > #include <linux/bpf.h> > #include <bpf/bpf_helpers.h> > > struct { > __uint(type, BPF_MAP_TYPE_PROG_ARRAY); > __uint(max_entries, 1); > __uint(key_size, sizeof(__u32)); > __uint(value_size, sizeof(__u32)); > } tail_call_map SEC(".maps"); > > static __attribute__((noinline)) int callee(struct xdp_md *ctx) > { > bpf_tail_call(ctx, &tail_call_map, 0); > > int ret; > asm volatile("%0 = 23" : "=r"(ret)); > return ret; > } > > static SEC("xdp") int caller(struct xdp_md *ctx) > { > int res = callee(ctx); > if (res == 23) { > return XDP_PASS; > } > return XDP_DROP; > } > > The verifier logs: > > func#0 @0 > func#1 @6 > 0: R1=ctx() R10=fp0 > ; int res = callee(ctx); @ test.c:24 > 0: (85) call pc+5 > caller: > R10=fp0 > callee: > frame1: R1=ctx() R10=fp0 > 6: frame1: R1=ctx() R10=fp0 > ; bpf_tail_call(ctx, &tail_call_map, 0); @ test.c:15 > 6: (18) r2 = 0xffff8a9c82a75800 ; frame1: R2_w=map_ptr(map=tail_call_map,ks=4,vs=4) > 8: (b4) w3 = 0 ; frame1: R3_w=0 > 9: (85) call bpf_tail_call#12 > 10: frame1: > ; asm volatile("%0 = 23" : "=r"(ret)); @ test.c:18 > 10: (b7) r0 = 23 ; frame1: R0_w=23 > ; return ret; @ test.c:19 > 11: (95) exit > returning from callee: > frame1: R0_w=23 R10=fp0 > to caller at 1: > R0_w=23 R10=fp0 > > from 11 to 1: R0_w=23 R10=fp0 > ; int res = callee(ctx); @ test.c:24 > 1: (bc) w1 = w0 ; R0_w=23 R1_w=23 > 2: (b4) w0 = 2 ; R0=2 > ; @ test.c:0 > 3: (16) if w1 == 0x17 goto pc+1 > 3: R1=23 > ; } @ test.c:29 > 5: (95) exit > processed 10 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1 > > And tracks R0_w=23 from the callee through to the caller. > This lets it completely prune the res != 23 branch, skipping over > instruction 4. > > But this isn't sound: the bpf_tail_call() could make the callee return > before r0 = 23. > > Aside from pruning incorrect branches, this can also be used to read and > write arbitrary memory by using r0 as a index. > > Make the verifier track instructions that can return abnormally as a > branch that either exits, or falls through to the remaining > instructions. > > This naturally checks for resource leaks, so we can remove the explicit > checks for tail_calls and LD_ABS. > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > Signed-off-by: Arthur Fabre <afabre@xxxxxxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- This patch is correct as far as I can tell. Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx> [...] > @@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env) > return err; > > mark_reg_scratched(env, BPF_REG_0); > + > + if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) { > + /* Explore both cases: tail_call fails and we fallthrough, > + * or it succeeds and we exit the current function. > + */ > + if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false)) > + return -ENOMEM; > + /* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case. > + * But it does on success, so we have to mark it after queueing the > + * fallthrough case, but before prepare_func_exit(). > + */ > + __mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]); > + exit = BPF_EXIT_TAIL_CALL; > + goto process_bpf_exit_full; > + } Nit: it's a bit unfortunate, that this logic is inside do_check, instead of check_helper_call() and check_ld_abs(). But it makes BPF_EXIT_* propagation simpler. > } else if (opcode == BPF_JA) { > if (BPF_SRC(insn->code) != BPF_K || > insn->src_reg != BPF_REG_0 || [...]