Re: Re: Recent eBPF verifier rejects program that was accepted by 6.8 eBPF verifier

Francis Deslauriers <francis.deslauriers@xxxxxxxxxxxxxxx> · Wed, 20 Nov 2024 21:55:54 +0000

Hi Maxim,

On Wed, 2024-11-20 at 22:33 +0200, Maxim Mikityanskiy wrote:
> Hi Francis,
> 
> On Fri, 15 Nov 2024 at 22:59, Francis Deslauriers
> <francis.deslauriers@xxxxxxxxxxxxxxx> wrote:
> > I added a stripped down libbpf reproducer based on the libbpf-
> > bootstrap repo to
> > showcase this issue (see below). Note that if I change the
> > RULE_LIST_SIZE from
> > 380 to 30, the verifier accepts the program.
> > 
> > I bisected the issue down to this commit:
> >         commit 8ecfc371d829bfed75e0ef2cab45b2290b982f64
> >         Author: Maxim Mikityanskiy <maxim@xxxxxxxxxxxxx>
> >         Date:   Mon Jan 8 22:52:02 2024 +0200
> > 
> >         bpf: Assign ID to scalars on spill
> 
> This commit was part of a series with a few new features that are
> indeed expected to raise the complexity with some programs. To
> compensate for it, a patch by Eduard was submitted within the same
> series. Could you please test your program on this kernel commit?
> 
> 6efbde200bf3 bpf: Handle scalar spill vs all MISC in stacksafe()
That commit was included in the commit from master that I tested.
> 
> I.e. whether it passes or fails, and If it fails, what's the biggest
> RULE_LIST_SIZE that passes. Let's see if Eduard's patch helps
> partially or doesn't address this issue at all (or helps fully, but
> there is another regression after his commit).
> 
> > It's my understanding that a eBPF program that is accepted by one
> > version of the verifier should be accepted on all subsequent
> > versions.
> 
> That's basically the goal. Obviously, some new features will increase
> the verification complexity, but we are trying to compensate for it
> or
> to make it insignificant.
> 
> For example, when I introduced my changes (with Eduard's patch), I
> tested the complexity before and after on the set of BPF programs
> that
> included kernel selftests and Cilium:
> 
> https://urldefense.com/v3/__https://patchwork.kernel.org/project/netdevbpf/cover/20240108205209.838365-1-maxtram95@xxxxxxxxx/__;!!BmdzS3_lV9HdKG8!0PQcMqrhQi11Pcrc3XFRWe8ktC7OEKTEQaqrNJU-Bs7nIYJZBsLBS7SUJh13nWpCApaZhIAmVrFy8kl9s2FW6puJcTE$
>  
> https://urldefense.com/v3/__https://patchwork.kernel.org/project/netdevbpf/cover/20240127175237.526726-1-maxtram95@xxxxxxxxx/__;!!BmdzS3_lV9HdKG8!0PQcMqrhQi11Pcrc3XFRWe8ktC7OEKTEQaqrNJU-Bs7nIYJZBsLBS7SUJh13nWpCApaZhIAmVrFy8kl9s2FWW01r49Y$
>  
> 
> As you can see, Eduard's patch really helped with most regressions,
> and the whole picture looked good enough. Apparently (if this series
> is indeed the culprit), your program uses some pattern that wasn't
> covered by the set of programs that I checked.
> 
> > I'm investigating how this commit affects how instructions are
> > counted
> >  and why it leads to such a drastic change for this particular
> > program.
> 
> I'd guess, most likely, something inside the loop body became more
> complex to verify, and it's repeated 380 times, further increasing
> the
> total complexity.
> 
> I wonder where 380 comes from. Is it just the maximum number of
> iterations that the old verifier could handle? Or does it have a
> deeper meaning?

That's it. It was the largest number of iterations that was accepted by
all the kernel verifier versions we support.
> 
> Is bpf_loop an option for you?

We are exploring new approaches for our future releases but our past
releases can't be changed.
> 
> > I wanted to share my findings early in case someone has any hints
> > for me.
> > 
> > To reproduce, use the following file as a drop in replacement of
> > libbpf-boostrap's examples/c/tc.bpf.c:
> 
> I reproduced the issue on 6.11.6, the highest RULE_LIST_SIZE that
> works for me is 35, but I currently lack time to take a deeper look.
> Do you have any new findings in the meanwhile?

No, I haven't found anything obvious. My current hypothesis is that
because we new assign IDs we lost some opportunities for state pruning.
In other words, states that were considered identical before the commit
are now considered distinct because of the IDs and thus can't be
pruned. This is a wild guess; I'm not very familiar with the verifier.

What's really odd is how big the jump is: from 380 to 35.

Cheers,
Francis