Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)

On Fri, 2023-07-07 at 11:08 -0700, Andrii Nakryiko wrote:
> On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> > 
> > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@xxxxxxxxx> wrote:
> > > 
> > > When it comes to fixing the problem, I don't quite know where to start.
> > > Perhaps these iteration callbacks ought to be treated more like global
> > > functions -- you can't always make assumptions about the state of the
> > > data in the context pointer. Treating the context pointer as totally
> > > opaque seems bad from a usability perspective. Maybe there's a way to
> > > attempt to verify the function body of the callback by treating all or
> > > part of the context as read-only, and then if that fails, go back and
> > > assume nothing about that part of the context structure. What is the
> > > right way to think about plugging this hole?
> > 
> > 'treat as global' might be a way to fix it, but it will likely break
> > some setups, since people pass pointers in a context and current
> > global func verification doesn't support that.
> 
> yeah, the intended use case is to return something from callbacks
> through context pointer. So it will definitely break real uses.
> 
> > I think the simplest fix would be to disallow writes into any pointers
> > within a ctx. Writes to scalars should still be allowed.
> 
> It might be a partial mitigation, but even with SCALARs there could be
> problems due to the assumed SCALAR range, which will be invalid if the
> callback is never called or called many times.
> 
> > Much more complex fix would be to verify callbacks as
> > process_iter_next_call(). See giant comment next to it.
> 
> yep, that seems like the right way forward. We need to simulate 0, 1,
> 2, ... executions of callbacks until we validate and exhaust all
> possible context modification permutations, just like open-coded
> iterators do.
> 
> > But since we need a fix for stable I'd try the simple approach first.
> > Could you try to implement that?

Hi All,

This issue seems stalled, so I took a look over the weekend.
I have a work-in-progress fix; please let me know if you don't agree
with the direction I'm taking or if I'm stepping on anyone's toes.

After some analysis I decided to go with Alexei's suggestion and
implement something similar to iterators for a selected set of helper
functions that accept "looping" callbacks, such as:
- bpf_loop
- bpf_for_each_map_elem
- bpf_user_ringbuf_drain
- bpf_timer_set_callback (?)
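
All of these helpers share the same shape: they repeatedly invoke a
programmer-supplied callback and thread a context pointer through it.
For reference, a minimal bpf_loop example of the intended, legitimate
pattern -- returning a result to the caller through the context -- might
look like the sketch below (illustrative only, not code from this
thread; names such as sum_ctx/sum_cb are made up):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct sum_ctx {
      __u64 sum;
  };

  /* invoked once per iteration; returning 0 continues the loop */
  static int sum_cb(__u32 idx, struct sum_ctx *ctx)
  {
      ctx->sum += idx;
      return 0;
  }

  SEC("socket")
  int sum_prog(struct __sk_buff *skb)
  {
      struct sum_ctx ctx = { .sum = 0 };

      bpf_loop(16, sum_cb, &ctx, 0);
      return ctx.sum & 1;
  }

  char _license[] SEC("license") = "GPL";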

The sketch of the fix looks as follows:
- extend struct bpf_func_state with callback iterator state information:
  - active/non-active flag
  - depth
  - r1-r5 state at the entry to the callback;
- extend __check_func_call() to set up the callback iterator state when
  a call to a suitable helper function is processed;
- extend BPF_EXIT processing (prepare_func_exit()) to queue a new
  callback visit upon return from an iterating callback
  (similar to process_iter_next_call());
- extend is_state_visited() to account for callback iterator hits in a
  way similar to regular iterators;
- extend backtrack_insn() to correctly react to jumps from callback
  exit to callback entry.
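
For concreteness, the kind of unsafe pattern such a fix has to reject
looks roughly like the sketch below. This is my reconstruction of the
problem discussed above (the original reproducer is not quoted in this
message, and names like esc_ctx/esc_cb are made up): the verifier
simulates the callback body exactly once, so the ctx state it assumes
after the helper call does not have to match the runtime state.

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct esc_ctx {
      __u64 i;
  };

  static int esc_cb(__u32 idx, struct esc_ctx *ctx)
  {
      ctx->i++;                /* executes nr_loops times at runtime */
      return 0;
  }

  SEC("socket")
  int escape(struct __sk_buff *skb)
  {
      __u64 buf[2] = { 0 };
      struct esc_ctx ctx = { .i = 0 };

      bpf_loop(100, esc_cb, &ctx, 0);
      /* the verifier tracked a single simulated callback run, so it
       * believes ctx.i == 1; at runtime ctx.i == 100 and this store
       * lands far past the end of buf */
      buf[ctx.i] = 42;
      return 0;
  }

  char _license[] SEC("license") = "GPL";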
  
I have a patch (at the end of this email) that correctly recognizes
the bpf_loop example in this thread as unsafe. However, this patch has
a few deficiencies:

- the verif_scale_strobemeta_bpf_loop selftest is not passing, because
  the read_map_var function invoked from the callback continuously
  increments the 'payload' pointer used in subsequent iterations.

  In order to handle such code I need to add upper-bound tracking for
  the iteration depth (when such a bound can be deduced).
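
  In rough strokes the problematic shape is a callback that advances a
  cursor stored in the context, so every simulated iteration produces a
  distinct ctx state (a simplified fragment of my own, not the actual
  strobemeta code; cursor_ctx/read_one_var are made-up names):

  struct cursor_ctx {
      __u64 buf[8];
      __u64 *payload;          /* cursor into buf, advanced per iteration */
  };

  static int read_one_var(__u32 idx, struct cursor_ctx *ctx)
  {
      /* each invocation leaves 'payload' at a new offset, so the looping
       * states never become equivalent unless the verifier can bound the
       * iteration depth */
      ctx->payload++;
      return 0;
  }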

- loop detection is broken for a simple callback like the one below:

  static int loop_callback_infinite(__u32 idx, __u64 *data)
  {
      for (;;)
          (*data)++;
      return 0;
  }

  To handle such code I need to change is_state_visited() to do
  callback iterator loop/hit detection on exit from the callback
  (returns are not prune points at the moment); currently it is done
  on entry.

- the callback below currently causes state explosion:

  static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
  {
      if (idx == 0)
          ctx->a = 1;
      if (idx == 1 && ctx->a == 1)
          ctx->b = 1;
      return 0;
  }
  
  I'm not sure yet what to do about this; there are several possibilities:
  - tweak the order in which states are visited (need to think about it);
  - check states in bpf_verifier_env::head (not explored yet) for
    equivalence and avoid enqueuing duplicate states.
    
I'll proceed with addressing the issues above on Monday.

Thanks,
Eduard

---

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index a3236651ec64..5589f55e42ba 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -70,6 +70,17 @@ enum bpf_iter_state {
 	BPF_ITER_STATE_DRAINED,
 };
 
+struct bpf_iter {
+	/* BTF container and BTF type ID describing
+	 * struct bpf_iter_<type> of an iterator state
+	 */
+	struct btf *btf;
+	u32 btf_id;
+	/* packing following two fields to fit iter state into 16 bytes */
+	enum bpf_iter_state state:2;
+	int depth:30;
+};
+
 struct bpf_reg_state {
 	/* Ordering of fields matters.  See states_equal() */
 	enum bpf_reg_type type;
@@ -115,16 +126,7 @@ struct bpf_reg_state {
 		} dynptr;
 
 		/* For bpf_iter stack slots */
-		struct {
-			/* BTF container and BTF type ID describing
-			 * struct bpf_iter_<type> of an iterator state
-			 */
-			struct btf *btf;
-			u32 btf_id;
-			/* packing following two fields to fit iter state into 16 bytes */
-			enum bpf_iter_state state:2;
-			int depth:30;
-		} iter;
+		struct bpf_iter iter;
 
 		/* Max size from any of the above. */
 		struct {
@@ -300,6 +302,8 @@ struct bpf_func_state {
 	bool in_callback_fn;
 	struct tnum callback_ret_range;
 	bool in_async_callback_fn;
+	struct bpf_iter callback_iter;
+	struct bpf_reg_state callback_entry_state[MAX_BPF_REG];
 
 	/* The following fields should be last. See copy_func_state() */
 	int acquired_refs;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index dbba2b806017..e79a4bec4461 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2357,6 +2357,7 @@ static void init_func_state(struct bpf_verifier_env *env,
 	state->callback_ret_range = tnum_range(0, 0);
 	init_reg_state(env, state);
 	mark_verifier_state_scratched(env);
+	state->callback_iter.state = BPF_ITER_STATE_INVALID;
 }
 
 /* Similar to push_stack(), but for async callbacks */
@@ -3599,6 +3600,39 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 			}
 		} else if (opcode == BPF_EXIT) {
 			bool r0_precise;
+			int subprog;
+
+			/* Jump from 'exit' to start of the same subprogram,
+			 * this happens for callback iteration, e.g.:
+			 *
+			 *   int cb_func(...) {
+			 *   start:
+			 *     ...
+			 *     return 0; // <-- BPF_EXIT processing in do_check()
+			 *               //     adds a state with IP set to 'start'.
+			 *   }
+			 *   bpf_loop(..., cb_func, ...);
+			 *
+			 * Clear r1-r5 as in the callback case above, but do
+			 * not change frame number.
+			 */
+			if (subseq_idx >= 0 &&
+			    ((subprog = find_subprog(env, subseq_idx)) >= 0) &&
+			    idx >= subseq_idx &&
+			    (subprog >= env->subprog_cnt ||
+			     idx < env->subprog_info[subprog + 1].start)) {
+				if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) {
+					verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
+					WARN_ONCE(1, "verifier backtracking bug");
+					return -EFAULT;
+				}
+				if (bt_stack_mask(bt) != 0)
+					return -ENOTSUPP;
+				/* clear r1-r5 in callback subprog's mask */
+				for (i = BPF_REG_1; i <= BPF_REG_5; i++)
+					bt_clear_reg(bt, i);
+				return 0;
+			}
 
 			if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) {
 				/* if backtracing was looking for registers R1-R5
@@ -8869,13 +8903,17 @@ static int set_callee_state(struct bpf_verifier_env *env,
 			    struct bpf_func_state *caller,
 			    struct bpf_func_state *callee, int insn_idx);
 
+static int set_loop_callback_state(struct bpf_verifier_env *env,
+				   struct bpf_func_state *caller,
+				   struct bpf_func_state *callee, int insn_idx);
+
 static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx, int subprog,
 			     set_callee_state_fn set_callee_state_cb)
 {
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_func_state *caller, *callee;
-	int err;
+	int err, i;
 
 	if (state->curframe + 1 >= MAX_CALL_FRAMES) {
 		verbose(env, "the call stack of %d frames is too deep\n",
@@ -8972,7 +9010,6 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 			*insn_idx /* callsite */,
 			state->curframe + 1 /* frameno within this callchain */,
 			subprog /* subprog number within this prog */);
-
 	/* Transfer references to the callee */
 	err = copy_reference_state(callee, caller);
 	if (err)
@@ -8982,6 +9019,14 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 	if (err)
 		goto err_out;
 
+	if (set_callee_state_cb == set_loop_callback_state) {
+		callee->callback_iter.state = BPF_ITER_STATE_ACTIVE;
+		callee->callback_iter.depth = 0;
+		for (i = BPF_REG_0; i < MAX_BPF_REG; ++i)
+			copy_register_state(&callee->callback_entry_state[i],
+					    &callee->regs[i]);
+	}
+
 	clear_caller_saved_regs(env, caller->regs);
 
 	/* only increment it after check_reg_arg() finished */
@@ -9256,7 +9301,7 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_func_state *caller, *callee;
 	struct bpf_reg_state *r0;
-	int err;
+	int err, i;
 
 	callee = state->frame[state->curframe];
 	r0 = &callee->regs[BPF_REG_0];
@@ -9301,6 +9346,25 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 			return err;
 	}
 
+	if (callee->in_callback_fn && callee->callback_iter.state == BPF_ITER_STATE_ACTIVE) {
+		struct bpf_verifier_state *queued_st;
+		struct bpf_func_state *queued_callee;
+		struct bpf_iter *queued_iter;
+
+		queued_st = push_stack(env, env->subprog_info[callee->subprogno].start,
+				       *insn_idx, false);
+		if (!queued_st)
+			return -ENOMEM;
+
+		queued_callee = queued_st->frame[callee->frameno];
+		queued_iter = &queued_callee->callback_iter;
+		queued_iter->depth++;
+
+		for (i = BPF_REG_0; i < MAX_BPF_REG; ++i)
+			copy_register_state(&queued_callee->regs[i],
+					    &queued_callee->callback_entry_state[i]);
+	}
+
 	*insn_idx = callee->callsite + 1;
 	if (env->log.level & BPF_LOG_LEVEL) {
 		verbose(env, "returning from callee:\n");
@@ -16112,6 +16176,15 @@ static bool is_iter_next_insn(struct bpf_verifier_env *env, int insn_idx)
 	return env->insn_aux_data[insn_idx].is_iter_next;
 }
 
+static bool is_callback_iter_entry(struct bpf_verifier_env *env, int insn_idx)
+{
+	struct bpf_func_state *cur_frame = cur_func(env);
+
+	return cur_frame->in_callback_fn &&
+	       env->subprog_info[cur_frame->subprogno].start == insn_idx &&
+	       cur_frame->callback_iter.state != BPF_ITER_STATE_INVALID;
+}
+
 /* is_state_visited() handles iter_next() (see process_iter_next_call() for
  * terminology) calls specially: as opposed to bounded BPF loops, it *expects*
  * states to match, which otherwise would look like an infinite loop. So while
@@ -16190,6 +16263,9 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
 			if (cur_slot->iter.depth != slot->iter.depth)
 				return true;
 		}
+		if (state->callback_iter.state == BPF_ITER_STATE_ACTIVE &&
+		    state->callback_iter.depth != cur->frame[fr]->callback_iter.depth)
+			return true;
 	}
 	return false;
 }
@@ -16277,6 +16353,12 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 				}
 				goto skip_inf_loop_check;
 			}
+			if (is_callback_iter_entry(env, insn_idx)) {
+				if (states_equal(env, &sl->state, cur) &&
+				    cur_func(env)->callback_iter.state == BPF_ITER_STATE_ACTIVE)
+					goto hit;
+				goto skip_inf_loop_check;
+			}
 			/* attempt to detect infinite loop to avoid unnecessary doomed work */
 			if (states_maybe_looping(&sl->state, cur) &&
 			    states_equal(env, &sl->state, cur) &&





