Re: [PATCH bpf-next v1 1/2] bpf: states with loop entry have incomplete read/precision marks

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Mon, 17 Mar 2025 14:46:47 -0700

On Mon, Mar 17, 2025 at 1:28 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Fri, 2025-03-14 at 19:51 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> > Looks like the whole concept of old-style liveness and precision
> > is broken with loops.
> > propagate_liveness() will take marks from old state,
> > but old is incomplete, so propagating them into cur doesn't
> > make cur complete either.
> >
> > > Another possibility is to forgo loop entries altogether and upon
> > > states_equal(cached, cur, RANGE_WITHIN) mark all registers in the
> > > `cached` state as read and precise, propagating this info in `cur`.
> > > I'll try this as well.
> >
> > Have a gut feel that it won't work.
> > Currently we have loop_entry->branches is a flag of "completeness".
> > which doesn't work for loops,
> > so maybe we need a bool flag for looping states and instead of:
> > force_exact = loop_entry && complete
> > use
> > force_exact = loop_entry || incomplete
> >
> > looping state will have "incomplete" flag cleared only when branches == 0 ?
> > or maybe never.
> >
> > The further we get into this the more I think we need to get rid of
> > existing liveness, precision, and everything path sensitive and
> > convert it all to data flow analysis.
>
> In [1] I tried conservatively marking cached state registers as both
> read and precise when states_equal(cached, cur, RANGE_WITHIN) == true.
> It works for the example at hand, but indeed falls apart because it
> interferes with widening logic.

hindsight 20/20 :)

> So, anything like below can't be
> verified anymore (what was I thinking?):
>
>     SEC("raw_tp")
>     __success
>     int iter_nested_deeply_iters(const void *ctx)
>     {
>         int sum = 0;
>
>         MY_PID_GUARD();
>
>         bpf_repeat(10) {
>                 ...
>                 sum += 1;
>                 ...
>         }
>
>         return sum;
>     }
>
> Looks like there are only two options left:
> a. propagate read/precision marks in state graph until fixed point is
>    reached;
> b. switch to CFG based DFA analysis (will need slot/register type
>    propagation to identify which scalars can be precise).
>
> (a) is more in line with what we do currently and probably less code.
> But I expect that it would be harder to think about in case if
> something goes wrong (e.g. we don't print/draw states graph at the
> moment, I have a debug patch for this that I just cherry-pick).

if (a) is not a ton of code it's worth giving it a shot,
but while(!converged) propagate_precision()
will look scary.
We definitely need to do (b) at the end.