Alexei Starovoitov wrote: > On Wed, Jun 19, 2019 at 05:24:32PM -0700, John Fastabend wrote: > > Alexei Starovoitov wrote: > > > Compilers often spill induction variables into the stack, > > > hence it is necessary for the verifier to track scalar values > > > of the registers through stack slots. > > > > > > Also few bpf programs were incorrectly rejected in the past, > > > since the verifier was not able to track such constants while > > > they were used to compute offsets into packet headers. > > > > > > Tracking constants through the stack significantly decreases > > > the chances of state pruning, since two different constants > > > are considered to be different by state equivalency. > > > End result that cilium tests suffer serious degradation in the number > > > of states processed and corresponding verification time increase. > > > > > > before after > > > bpf_lb-DLB_L3.o 1838 6441 > > > bpf_lb-DLB_L4.o 3218 5908 > > > bpf_lb-DUNKNOWN.o 1064 1064 > > > bpf_lxc-DDROP_ALL.o 26935 93790 > > > bpf_lxc-DUNKNOWN.o 34439 123886 > > > bpf_netdev.o 9721 31413 > > > bpf_overlay.o 6184 18561 > > > bpf_lxc_jit.o 39389 359445 > > > > > > After further debugging turned out that cillium progs are > > > getting hurt by clang due to the same constant tracking issue. > > > Newer clang generates better code by spilling less to the stack. > > > Instead it keeps more constants in the registers which > > > hurts state pruning since the verifier already tracks constants > > > in the registers: > > > old clang new clang > > > (no spill/fill tracking introduced by this patch) > > > bpf_lb-DLB_L3.o 1838 1923 > > > bpf_lb-DLB_L4.o 3218 3077 > > > bpf_lb-DUNKNOWN.o 1064 1062 > > > bpf_lxc-DDROP_ALL.o 26935 166729 > > > bpf_lxc-DUNKNOWN.o 34439 174607 > > ^^^^^^^^^^^^^^ > > Any idea what happened here? Going from 34439 -> 174607 on the new clang? > > As I was alluding in commit log newer clang is smarter and generates > less spill/fill of constants. > In particular older clang loads two constants into r8 and r9 > and immediately spills them into stack. Then fills later, > does a bunch of unrelated code and calls into helper that > has ARG_ANYTHING for that position. Then doing a bit more math > on filled constants, spills them again and so on. > Before this patch (that tracks spill/fill of constants into stack) > pruning points were equivalent, but with the patch it sees the difference > in registers and declares states not equivalent, though any constant > is fine from safety standpoint. > With new clang only r9 has this pattern of spill/fill. > New clang manages to keep constant in r8 to be around without spill/fill. > Existing verifier tracks constants so even without this patch > the same pathalogical behavior is observed. > The verifier need to walk a lot more instructions only because > r8 has different constants. > Got it I'll try out latest clang. > > > bpf_netdev.o 9721 8407 > > > bpf_overlay.o 6184 5420 > > > bpf_lcx_jit.o 39389 39389 > > > > > > The final table is depressing: > > > old clang old clang new clang new clang > > > const spill/fill const spill/fill > > > bpf_lb-DLB_L3.o 1838 6441 1923 8128 > > > bpf_lb-DLB_L4.o 3218 5908 3077 6707 > > > bpf_lb-DUNKNOWN.o 1064 1064 1062 1062 > > > bpf_lxc-DDROP_ALL.o 26935 93790 166729 380712 > > > bpf_lxc-DUNKNOWN.o 34439 123886 174607 440652 > > > bpf_netdev.o 9721 31413 8407 31904 > > > bpf_overlay.o 6184 18561 5420 23569 > > > bpf_lxc_jit.o 39389 359445 39389 359445 > > > > > > Tracking constants in the registers hurts state pruning already. > > > Adding tracking of constants through stack hurts pruning even more. > > > The later patch address this general constant tracking issue > > > with coarse/precise logic. > > > > > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx> > > > Acked-by: Andrii Nakryiko <andriin@xxxxxx> > > > --- > > > kernel/bpf/verifier.c | 90 +++++++++++++++++++++++++++++++------------ > > > 1 file changed, 65 insertions(+), 25 deletions(-) > > > > I know these are already in bpf-next sorry it took me awhile to get > > time to review, but looks good to me. Thanks! We had something similar > > in the earlier loop test branch from last year. > > It's not in bpf-next yet :) oops was looking at the wrong branch on my side. > Code reviews are appreciated at any time. > Looks like we were just lucky with older clang. > I haven't tracked which clang version became smarter. > If you haven't seen this issue and haven't changed cilium C source > to workaround that then there is chance you'll hit it as well. > By "new clang" I meant version 9.0 I'll take a look at Cilium sources with version 9.0 > "old clang" is unknown. I just had cilium elf .o around that > I kept using for testing without recompiling them. > Just by chance I recompiled them to see annotated verifier line info > messages with BTF and hit this interesting issue. > See patch 9 backtracking logic that resolves this 'precision of scalar' > issue for progs compiled with both new and old clangs. > working my way through the series now, but for this patch Acked-by: John Fastabend <john.fastabend@xxxxxxxxx>