Re: [PATCH bpf-next 07/10] bpf: fix mark_all_scalars_precise use in mark_chain_precision

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 4 May 2023 15:47:53 -0700

On Thu, May 04, 2023 at 03:39:50PM -0700, Andrii Nakryiko wrote:
> On Thu, May 4, 2023 at 3:32 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, May 04, 2023 at 03:00:06PM -0700, Andrii Nakryiko wrote:
> > > On Thu, May 4, 2023 at 9:37 AM Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Apr 25, 2023 at 04:49:08PM -0700, Andrii Nakryiko wrote:
> > > > > When precision backtracking bails out due to some unsupported sequence
> > > > > of instructions (e.g., stack access through register other than r10), we
> > > > > need to mark all SCALAR registers as precise to be safe. Currently,
> > > > > though, we mark SCALARs precise only starting from the state we detected
> > > > > unsupported condition, which could be one of the parent states of the
> > > > > actual current state. This will leave some registers potentially not
> > > > > marked as precise, even though they should. So make sure we start
> > > > > marking scalars as precise from current state (env->cur_state).
> > > > >
> > > > > Further, we don't currently detect a situation when we end up with some
> > > > > stack slots marked as needing precision, but we ran out of available
> > > > > states to find the instructions that populate those stack slots. This is
> > > > > akin the `i >= func->allocated_stack / BPF_REG_SIZE` check and should be
> > > > > handled similarly by falling back to marking all SCALARs precise. Add
> > > > > this check when we run out of states.
> > > > >
> > > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > > ---
> > > > >  kernel/bpf/verifier.c                          | 16 +++++++++++++---
> > > > >  tools/testing/selftests/bpf/verifier/precise.c |  9 +++++----
> > > > >  2 files changed, 18 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > index 66d64ac10fb1..35f34c977819 100644
> > > > > --- a/kernel/bpf/verifier.c
> > > > > +++ b/kernel/bpf/verifier.c
> > > > > @@ -3781,7 +3781,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
> > > > >                               err = backtrack_insn(env, i, bt);
> > > > >                       }
> > > > >                       if (err == -ENOTSUPP) {
> > > > > -                             mark_all_scalars_precise(env, st);
> > > > > +                             mark_all_scalars_precise(env, env->cur_state);
> > > > >                               bt_reset(bt);
> > > > >                               return 0;
> > > > >                       } else if (err) {
> > > > > @@ -3843,7 +3843,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
> > > > >                                        * fp-8 and it's "unallocated" stack space.
> > > > >                                        * In such case fallback to conservative.
> > > > >                                        */
> > > > > -                                     mark_all_scalars_precise(env, st);
> > > > > +                                     mark_all_scalars_precise(env, env->cur_state);
> > > > >                                       bt_reset(bt);
> > > > >                                       return 0;
> > > > >                               }
> > > > > @@ -3872,11 +3872,21 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
> > > > >               }
> > > > >
> > > > >               if (bt_bitcnt(bt) == 0)
> > > > > -                     break;
> > > > > +                     return 0;
> > > > >
> > > > >               last_idx = st->last_insn_idx;
> > > > >               first_idx = st->first_insn_idx;
> > > > >       }
> > > > > +
> > > > > +     /* if we still have requested precise regs or slots, we missed
> > > > > +      * something (e.g., stack access through non-r10 register), so
> > > > > +      * fallback to marking all precise
> > > > > +      */
> > > > > +     if (bt_bitcnt(bt) != 0) {
> > > > > +             mark_all_scalars_precise(env, env->cur_state);
> > > > > +             bt_reset(bt);
> > > > > +     }
> > > >
> > > > We get here only after:
> > > >   st = st->parent;
> > > >   if (!st)
> > > >           break;
> > > >
> > > > which is the case when we reach the very beginning of the program (parent == NULL) and
> > > > there are still regs or stack with marks.
> > > > That's a situation when backtracking encountered something we didn't foresee. Some new
> > > > condition. Currently we don't have selftest that trigger this.
> > > > So as a defensive mechanism it makes sense to do mark_all_scalars_precise(env, env->cur_state);
> > > > Probably needs verbose("verifier backtracking bug") as well.
> > >
> > > I hesitated to add a bug message because of a known case where this
> > > could happen: stack access through non-r10 register. So it's not a
> > > bug, it's similar to -ENOTSUPP cases above, kind of expected, if rare.
> >
> > fair enough. I'm ok to skip verbose().
> >
> > > >
> > > > But for the other two cases mark_all_scalars_precise(env, st); is safe.
> > > > What's the reason to mark everything precise from the very beginning of backtracking (last seen state == current state).
> > > > Since unsupported condition was in the middle it's safe to mark from that condition till start of prog.
> > >
> > > So I don't have a constructed example and it's more of a hunch. But
> > > let's use the same stack slot precision when some instructions use
> > > non-r10 registers. We can't always reliably and immediately detect
> > > this situation, so the point at which we realize that something is off
> > > could be in parent state, but that same slot might be used in children
> > > state and if we don't mark everything precise from the current state,
> > > then we are running a risk of leaving some of the slots as imprecise.
> > >
> > > Before patch #8 similar situation could be also with r0 returned from
> > > static subprog. Something like this:
> > >
> > >
> > > P2: call subprog
> > > <<- checkpoint ->>
> > > P1: r1 = r0;
> > > <<- checkpoint ->>
> > > CUR: r2 = <map value pointer>
> > >      r2 += r1   <--- need to mark r0 precise and propagate all the way
> > > to P2 and beyond
> > >
> > >
> > > Precision propagation will be triggered in CUR state. Will go through
> > > P1, come back to P2 and (before patch #8 changes) will realize this is
> > > unsupported case. Currently we'll mark R0 precise only in P2 and its
> > > parents (could be also r6-r9 registers situation).
> >
> > Correct. I'm with you so far...
> >
> > > But really we
> > > should make sure that P1 also has r1 marked as precise.
> >
> > and here I'm a bit lost.
> > Precision marking will do r1->precise=true while walking P1 in that state.
> >
> > The change:
> > mark_all_scalars_precise(env, env->cur_state);
> > will mark both r0 and r1 in P1, but that's unnecessary.
> >
> > > So as a general rule, I think the only safe default is to mark
> > > everything precise from current state.
> > >
> > > Is the above convincing or should I revert back to old behavior?
> >
> > I'm not seeing the issue yet.
> 
> you are right, second example is not relevant because we'll be
> properly propagating register precision along the way. So I think
> stack access is the only one that might be problematic.
> 
> I assume you are worried about regressing due to marking more stuff
> precise, right? My thinking (besides the subtle correctness issues
> with stack access through non-r10 registers) was that all the other
> patterns should be now handled properly, so we shouldn't be even
> seeing mark_all_precise() use in practice.
> 
> But I'm fine reverting those two existing mark_all_precise() calls, if
> you insist.

Regressing due to this was the main concern, but since you've run
all possible things with veristat I guess it's fine then.
It doesn't look like extra pessimism hurts too much.
I'm guessing this change was the reason for small increase in cilium progs.