On Thu, May 04, 2023 at 03:39:50PM -0700, Andrii Nakryiko wrote: > On Thu, May 4, 2023 at 3:32 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Thu, May 04, 2023 at 03:00:06PM -0700, Andrii Nakryiko wrote: > > > On Thu, May 4, 2023 at 9:37 AM Alexei Starovoitov > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > On Tue, Apr 25, 2023 at 04:49:08PM -0700, Andrii Nakryiko wrote: > > > > > When precision backtracking bails out due to some unsupported sequence > > > > > of instructions (e.g., stack access through register other than r10), we > > > > > need to mark all SCALAR registers as precise to be safe. Currently, > > > > > though, we mark SCALARs precise only starting from the state we detected > > > > > unsupported condition, which could be one of the parent states of the > > > > > actual current state. This will leave some registers potentially not > > > > > marked as precise, even though they should. So make sure we start > > > > > marking scalars as precise from current state (env->cur_state). > > > > > > > > > > Further, we don't currently detect a situation when we end up with some > > > > > stack slots marked as needing precision, but we ran out of available > > > > > states to find the instructions that populate those stack slots. This is > > > > > akin the `i >= func->allocated_stack / BPF_REG_SIZE` check and should be > > > > > handled similarly by falling back to marking all SCALARs precise. Add > > > > > this check when we run out of states. > > > > > > > > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > > > --- > > > > > kernel/bpf/verifier.c | 16 +++++++++++++--- > > > > > tools/testing/selftests/bpf/verifier/precise.c | 9 +++++---- > > > > > 2 files changed, 18 insertions(+), 7 deletions(-) > > > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > > index 66d64ac10fb1..35f34c977819 100644 > > > > > --- a/kernel/bpf/verifier.c > > > > > +++ b/kernel/bpf/verifier.c > > > > > @@ -3781,7 +3781,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r > > > > > err = backtrack_insn(env, i, bt); > > > > > } > > > > > if (err == -ENOTSUPP) { > > > > > - mark_all_scalars_precise(env, st); > > > > > + mark_all_scalars_precise(env, env->cur_state); > > > > > bt_reset(bt); > > > > > return 0; > > > > > } else if (err) { > > > > > @@ -3843,7 +3843,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r > > > > > * fp-8 and it's "unallocated" stack space. > > > > > * In such case fallback to conservative. > > > > > */ > > > > > - mark_all_scalars_precise(env, st); > > > > > + mark_all_scalars_precise(env, env->cur_state); > > > > > bt_reset(bt); > > > > > return 0; > > > > > } > > > > > @@ -3872,11 +3872,21 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r > > > > > } > > > > > > > > > > if (bt_bitcnt(bt) == 0) > > > > > - break; > > > > > + return 0; > > > > > > > > > > last_idx = st->last_insn_idx; > > > > > first_idx = st->first_insn_idx; > > > > > } > > > > > + > > > > > + /* if we still have requested precise regs or slots, we missed > > > > > + * something (e.g., stack access through non-r10 register), so > > > > > + * fallback to marking all precise > > > > > + */ > > > > > + if (bt_bitcnt(bt) != 0) { > > > > > + mark_all_scalars_precise(env, env->cur_state); > > > > > + bt_reset(bt); > > > > > + } > > > > > > > > We get here only after: > > > > st = st->parent; > > > > if (!st) > > > > break; > > > > > > > > which is the case when we reach the very beginning of the program (parent == NULL) and > > > > there are still regs or stack with marks. > > > > That's a situation when backtracking encountered something we didn't foresee. Some new > > > > condition. Currently we don't have selftest that trigger this. > > > > So as a defensive mechanism it makes sense to do mark_all_scalars_precise(env, env->cur_state); > > > > Probably needs verbose("verifier backtracking bug") as well. > > > > > > I hesitated to add a bug message because of a known case where this > > > could happen: stack access through non-r10 register. So it's not a > > > bug, it's similar to -ENOTSUPP cases above, kind of expected, if rare. > > > > fair enough. I'm ok to skip verbose(). > > > > > > > > > > But for the other two cases mark_all_scalars_precise(env, st); is safe. > > > > What's the reason to mark everything precise from the very beginning of backtracking (last seen state == current state). > > > > Since unsupported condition was in the middle it's safe to mark from that condition till start of prog. > > > > > > So I don't have a constructed example and it's more of a hunch. But > > > let's use the same stack slot precision when some instructions use > > > non-r10 registers. We can't always reliably and immediately detect > > > this situation, so the point at which we realize that something is off > > > could be in parent state, but that same slot might be used in children > > > state and if we don't mark everything precise from the current state, > > > then we are running a risk of leaving some of the slots as imprecise. > > > > > > Before patch #8 similar situation could be also with r0 returned from > > > static subprog. Something like this: > > > > > > > > > P2: call subprog > > > <<- checkpoint ->> > > > P1: r1 = r0; > > > <<- checkpoint ->> > > > CUR: r2 = <map value pointer> > > > r2 += r1 <--- need to mark r0 precise and propagate all the way > > > to P2 and beyond > > > > > > > > > Precision propagation will be triggered in CUR state. Will go through > > > P1, come back to P2 and (before patch #8 changes) will realize this is > > > unsupported case. Currently we'll mark R0 precise only in P2 and its > > > parents (could be also r6-r9 registers situation). > > > > Correct. I'm with you so far... > > > > > But really we > > > should make sure that P1 also has r1 marked as precise. > > > > and here I'm a bit lost. > > Precision marking will do r1->precise=true while walking P1 in that state. > > > > The change: > > mark_all_scalars_precise(env, env->cur_state); > > will mark both r0 and r1 in P1, but that's unnecessary. > > > > > So as a general rule, I think the only safe default is to mark > > > everything precise from current state. > > > > > > Is the above convincing or should I revert back to old behavior? > > > > I'm not seeing the issue yet. > > you are right, second example is not relevant because we'll be > properly propagating register precision along the way. So I think > stack access is the only one that might be problematic. > > I assume you are worried about regressing due to marking more stuff > precise, right? My thinking (besides the subtle correctness issues > with stack access through non-r10 registers) was that all the other > patterns should be now handled properly, so we shouldn't be even > seeing mark_all_precise() use in practice. > > But I'm fine reverting those two existing mark_all_precise() calls, if > you insist. Regressing due to this was the main concern, but since you've run all possible things with veristat I guess it's fine then. It doesn't look like extra pessimism hurts too much. I'm guessing this change was the reason for small increase in cilium progs.