Re: [PATCH bpf v3 1/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS

Eduard Zingerman <eddyz87@xxxxxxxxx> · Mon, 06 Jan 2025 12:31:27 -0800

On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
> bpf_tail_call(), LD_ABS, and LD_IND can cause the current function to
> return abnormally:
> - On success, bpf_tail_call() will jump to the tail_called program, and
>   that program will return directly to the outer caller.
> - On failure, LD_ABS or LD_IND return directly to the outer caller.
> 
> But the verifier doesn't account for these abnormal exits, so it assumes
> the instructions following a bpf_tail_call() or LD_ABS are always
> executed, and updates bounds info accordingly.
> 
> Before BPF to BPF calls that was ok: the whole BPF program would
> terminate anyways, so it didn't matter that the verifier state didn't
> match reality.
> 
> But if these instructions are used in a function call, the verifier will
> propagate some of this incorrect bounds info to the caller. There are at
> least two kinds of this:
> - The callee's return value in the caller.
> - References to the caller's stack passed into the caller.
> 
> For example, loading:
> 
>     #include <linux/bpf.h>
>     #include <bpf/bpf_helpers.h>
> 
>     struct {
>             __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
>             __uint(max_entries, 1);
>             __uint(key_size, sizeof(__u32));
>             __uint(value_size, sizeof(__u32));
>     } tail_call_map SEC(".maps");
> 
>     static __attribute__((noinline)) int callee(struct xdp_md *ctx)
>     {
>             bpf_tail_call(ctx, &tail_call_map, 0);
> 
>             int ret;
>             asm volatile("%0 = 23" : "=r"(ret));
>             return ret;
>     }
> 
>     static SEC("xdp") int caller(struct xdp_md *ctx)
>     {
>             int res = callee(ctx);
>             if (res == 23) {
>                     return XDP_PASS;
>             }
>             return XDP_DROP;
>     }
> 
> The verifier logs:
> 
>     func#0 @0
>     func#1 @6
>     0: R1=ctx() R10=fp0
>     ; int res = callee(ctx); @ test.c:24
>     0: (85) call pc+5
>     caller:
>      R10=fp0
>     callee:
>      frame1: R1=ctx() R10=fp0
>     6: frame1: R1=ctx() R10=fp0
>     ; bpf_tail_call(ctx, &tail_call_map, 0); @ test.c:15
>     6: (18) r2 = 0xffff8a9c82a75800       ; frame1: R2_w=map_ptr(map=tail_call_map,ks=4,vs=4)
>     8: (b4) w3 = 0                        ; frame1: R3_w=0
>     9: (85) call bpf_tail_call#12
>     10: frame1:
>     ; asm volatile("%0 = 23" : "=r"(ret)); @ test.c:18
>     10: (b7) r0 = 23                      ; frame1: R0_w=23
>     ; return ret; @ test.c:19
>     11: (95) exit
>     returning from callee:
>      frame1: R0_w=23 R10=fp0
>     to caller at 1:
>      R0_w=23 R10=fp0
> 
>     from 11 to 1: R0_w=23 R10=fp0
>     ; int res = callee(ctx); @ test.c:24
>     1: (bc) w1 = w0                       ; R0_w=23 R1_w=23
>     2: (b4) w0 = 2                        ; R0=2
>     ;  @ test.c:0
>     3: (16) if w1 == 0x17 goto pc+1
>     3: R1=23
>     ; } @ test.c:29
>     5: (95) exit
>     processed 10 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
> 
> And tracks R0_w=23 from the callee through to the caller.
> This lets it completely prune the res != 23 branch, skipping over
> instruction 4.
> 
> But this isn't sound: the bpf_tail_call() could make the callee return
> before r0 = 23.
> 
> Aside from pruning incorrect branches, this can also be used to read and
> write arbitrary memory by using r0 as a index.
> 
> Make the verifier track instructions that can return abnormally as a
> branch that either exits, or falls through to the remaining
> instructions.
> 
> This naturally checks for resource leaks, so we can remove the explicit
> checks for tail_calls and LD_ABS.
> 
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Signed-off-by: Arthur Fabre <afabre@xxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---

This patch is correct as far as I can tell.

Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>

[...]

> @@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env)
>  					return err;
>  
>  				mark_reg_scratched(env, BPF_REG_0);
> +
> +				if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) {
> +					/* Explore both cases: tail_call fails and we fallthrough,
> +					 * or it succeeds and we exit the current function.
> +					 */
> +					if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
> +						return -ENOMEM;
> +					/* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case.
> +					 * But it does on success, so we have to mark it after queueing the
> +					 * fallthrough case, but before prepare_func_exit().
> +					 */
> +					__mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]);
> +					exit = BPF_EXIT_TAIL_CALL;
> +					goto process_bpf_exit_full;
> +				}

Nit: it's a bit unfortunate, that this logic is inside do_check,
     instead of check_helper_call() and check_ld_abs().
     But it makes BPF_EXIT_* propagation simpler.

>  			} else if (opcode == BPF_JA) {
>  				if (BPF_SRC(insn->code) != BPF_K ||
>  				    insn->src_reg != BPF_REG_0 ||

[...]