Re: [PATCH bpf-next 5/7] bpf: Mark potential spilled loop index variable as precise

Yonghong Song <yhs@xxxxxxxx> · Thu, 6 Apr 2023 09:55:37 -0700

On 4/4/23 3:09 PM, Andrii Nakryiko wrote:
On Wed, Mar 29, 2023 at 10:56 PM Yonghong Song <yhs@xxxxxx> wrote:

For a loop, if the loop index variable is spilled and, between loop
iterations, the only reg/spill state difference is the spilled loop
index variable, then the verifier may assume an infinite loop, which
causes verification failure. In such cases, we should mark the
spilled loop index variable as precise to differentiate states
between loop iterations.

Since the verifier cannot accurately identify the loop index
variable, add a heuristic: if both the old and the new reg states
are constant scalars, mark the old reg state as precise, which
triggers constant value comparison later.

Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
  kernel/bpf/verifier.c | 20 ++++++++++++++++++--
  1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d070943a8ba1..d1aa2c7ae7c0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14850,6 +14850,23 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
                 /* Both old and cur are having same slot_type */
                 switch (old->stack[spi].slot_type[BPF_REG_SIZE - 1]) {
                 case STACK_SPILL:
+                       /* Sometimes the loop index variable is spilled and the
+                        * spill is not marked as precise. If the only state
+                        * difference between two iterations is the spilled loop
+                        * index, the "infinite loop detected at insn" error will
+                        * be hit. Mark the spilled constant as precise so that it
+                        * goes through value comparison.
+                        */
+                       old_reg = &old->stack[spi].spilled_ptr;
+                       cur_reg = &cur->stack[spi].spilled_ptr;
+                       if (!old_reg->precise) {
+                               if (old_reg->type == SCALAR_VALUE &&
+                                   cur_reg->type == SCALAR_VALUE &&
+                                   tnum_is_const(old_reg->var_off) &&
+                                   tnum_is_const(cur_reg->var_off))
+                                       old_reg->precise = true;
+                       }
+

I'm very worried about heuristics like this. Thinking in the abstract,
if a scalar's value is important for some loop invariant and would
guarantee that some jump is always taken or never taken, then the jump
prediction logic should mark the register (and then, through precision
backtracking, the stack slot) as precise. But if precise values
don't guarantee that only one branch is taken, then marking the slot as
precise makes no sense.

Let's be very diligent with changes like this. I think your other
patches should already help with marking the necessary slots as precise;
can you double check whether this issue still happens? If it does, let's
address it as a separate feature. The rest of the verifier logic changes
in this patch set look good to me and make total sense.

Yes, this is a heuristic, so it will mark non-induction variables as
precise as well. Let me study this a little more. I just posted v2
without this patch and its corresponding tests.



                         /* when explored and current stack slot are both storing
                          * spilled registers, check that stored pointers types
                          * are the same as well.
@@ -14860,8 +14877,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
                          * such verifier states are not equivalent.
                          * return false to continue verification of this path
                          */
-                       if (!regsafe(env, &old->stack[spi].spilled_ptr,
-                                    &cur->stack[spi].spilled_ptr, idmap))
+                       if (!regsafe(env, old_reg, cur_reg, idmap))
                                 return false;
                         break;
                 case STACK_DYNPTR:
--
2.34.1
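
The heuristic in the quoted patch and the objection to it can be modelled
outside the kernel. What follows is only a sketch under stated assumptions:
struct toy_scalar and every toy_* function are hypothetical names invented
for illustration, and the comparison is a heavily simplified stand-in for
the scalar case of the verifier's state equivalence check, not the actual
code in kernel/bpf/verifier.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a spilled scalar; NOT the kernel's struct bpf_reg_state. */
struct toy_scalar {
	bool is_const;   /* value fully known, loosely like tnum_is_const() */
	uint64_t value;  /* meaningful only when is_const is true */
	bool precise;    /* must the value take part in state comparison? */
};

/* Simplified equivalence check (assumption: models only the scalar case).
 * An imprecise old scalar matches any current scalar; a precise constant
 * matches only the same constant.
 */
static bool toy_scalar_safe(const struct toy_scalar *old,
			    const struct toy_scalar *cur)
{
	if (!old->precise)
		return true;
	if (!old->is_const)
		return true; /* an unknown old value covers every current value */
	return cur->is_const && cur->value == old->value;
}

/* The heuristic from the patch, restated on the toy type. */
static void toy_mark_const_spill_precise(struct toy_scalar *old,
					 const struct toy_scalar *cur)
{
	if (!old->precise && old->is_const && cur->is_const)
		old->precise = true;
}

/* Would two states whose only difference is one spilled constant
 * (old_val vs. cur_val) be judged equivalent?
 */
static bool toy_states_look_equal(uint64_t old_val, uint64_t cur_val,
				  bool use_heuristic)
{
	struct toy_scalar old = { .is_const = true, .value = old_val };
	struct toy_scalar cur = { .is_const = true, .value = cur_val };

	if (use_heuristic)
		toy_mark_const_spill_precise(&old, &cur);
	return toy_scalar_safe(&old, &cur);
}

/* Self-check covering both sides of the discussion. */
static void toy_self_check(void)
{
	/* Spilled loop index 0 -> 1: without the heuristic the iterations
	 * look identical, which is the false "infinite loop" case.
	 */
	assert(toy_states_look_equal(0, 1, false));
	/* With the heuristic the values are compared and differ. */
	assert(!toy_states_look_equal(0, 1, true));
	/* Cost: the heuristic cannot tell whether the slot is a loop index,
	 * so any differing spilled constant now blocks equivalence.
	 */
	assert(!toy_states_look_equal(7, 8, true));
}
```

Calling toy_self_check() from a main() exercises all three cases. The last
assertion is the toy version of the review concern: the forced comparison
applies to every spilled constant pair, whether or not the slot influences
any branch.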