On Mon, Aug 29, 2022 at 7:50 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 8/28/22 1:53 AM, KP Singh wrote:
> > On Sat, Aug 27, 2022 at 1:15 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> >>
> >> Add a small tool, veristat, that allows mass-verification of
> >> a set of *libbpf-compatible* BPF ELF object files. For each such object
> >> file, veristat will attempt to verify each BPF program *individually*.
> >> Regardless of success or failure, it parses BPF verifier stats and
> >> outputs them in human-readable table format. In the future we can also
> >> add CSV and JSON output for more scriptable post-processing, if necessary.
> >>
> >> veristat allows specifying the set of stats that should be output and
> >> the ordering across multiple objects and files (e.g., so that one can
> >> easily order by total instructions processed, instead of the default
> >> file name, prog name, verdict, total instructions order).
> >>
> >> This tool should be useful for validating various BPF verifier changes
> >> or even validating different kernel versions for regressions.
> >
> > Cool stuff!
>
> +1, out of curiosity, did you try with different kernels to see the deltas?

Nope, not yet, I barely got the code to its current state before leaving on
vacation. But I have thought about using this to track regressions and
improvements over time as we make changes to the BPF verifier. I was thinking
of having not just the human-readable table output, but also CSV and/or JSON,
so that we can build some sort of automation to run this periodically (or even
in BPF CI for each patch set) and yell about significant changes. The veristat
changes are easy, but someone will need to build this sort of automation.
Projects like rustc (the Rust compiler) have this kind of thing very nicely
formalized; we might want to do the same for Clang + BPF verifier changes as
well.
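
To make the machine-readable output idea a bit more concrete, here is a purely
illustrative sketch of emitting one CSV row per program, with columns mirroring
the table output further down. The struct layout and function name are made up
for illustration only and are not code from this patch:

/* Illustrative only: one CSV row per verified program, columns matching
 * the human-readable table (file, prog, verdict, duration, insns, states).
 */
#include <stdbool.h>
#include <stdio.h>

struct prog_stats {
        const char *file;
        const char *prog;
        bool success;
        long duration_us;
        long total_insns;
        long total_states;
        long peak_states;
};

static void emit_csv_row(FILE *out, const struct prog_stats *s)
{
        fprintf(out, "%s,%s,%s,%ld,%ld,%ld,%ld\n",
                s->file, s->prog, s->success ? "success" : "failure",
                s->duration_us, s->total_insns, s->total_states,
                s->peak_states);
}

A fixed header line plus one such row per program would already be enough for
CI to diff two runs with standard tooling.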
> > I think this would be useful for cases beyond these (i.e. for users to get
> > stats about the verifier in general) and it's worth thinking about whether
> > this should be built into bpftool?
> >
> >>
> >> Here's an example for some of the heaviest selftests/bpf BPF object
> >> files:
> >>
> >> $ sudo ./veristat -s insns,file,prog {pyperf,loop,test_verif_scale,strobemeta,test_cls_redirect,profiler}*.linked3.o
> >> File                                 Program                              Verdict Duration, us Total insns Total states Peak states
> >> ------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ -----------
> >> loop3.linked3.o                      while_true                           failure       350990     1000001         9663        9663
> >
> > [...]
>
> nit: Looks like CI on gcc is bailing:
>
> https://github.com/kernel-patches/bpf/runs/8072477251?check_suite_focus=true
>
> [...]
>   INSTALL /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/include/bpf/skel_internal.h
> In file included from /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/include/bpf/libbpf.h:20,
>   INSTALL /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/include/bpf/libbpf_version.h
>                  from veristat.c:17:
> /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/include/bpf/libbpf_common.h:13:10: fatal error: libbpf_version.h: No such file or directory
>    13 | #include "libbpf_version.h"
>       |          ^~~~~~~~~~~~~~~~~~

hm... Makefile dependencies not correct? I'll check and fix.

> compilation terminated.
>   INSTALL /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/include/bpf/usdt.bpf.h
>   HOSTCC /tmp/work/bpf/bpf/tools/testing/selftests/bpf/tools/build/libbpf/fixdep.o
> make: *** [Makefile:165: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/veristat.o] Error 1
> make: *** Waiting for unfinished jobs....
>
> I wonder, to detect regressions in pruning behavior, could we add a test_progs subtest to load
> selected obj files and compare before/after 'verified insns' numbers? The workflow probably
> makes this a bit hard to run with a kernel before this change, but maybe it could be a starting
> point where we have a checked-in file containing current numbers, and e.g. if the new change
> crosses a threshold of current +10% then the test could fail?

Heh, I wrote the above before reading this. But yes, once we add CSV output
and some sort of baseline upload of the latest stats, we should be able to do
the diff and yell about major regressions (a rough sketch of such a check is
at the bottom of this mail).

I also want to add test_progs's test/subtest glob selection logic for the
object/program combo, so that we can narrow down the list of objects and
programs within them to test. Most of the programs are trivial and just
pollute the output, so a shorter list is better. And then we can check a
representative list into selftests/bpf, just like we do with
DENYLIST/ALLOWLIST for BPF CI. But you know, baby steps :)

> Thanks,
> Daniel
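
P.S. Here is a rough sketch of the kind of +10% threshold check discussed
above, comparing fresh numbers against a checked-in baseline file of
"<prog> <verified insns>" lines. The baseline format and all names below are
made up for illustration; none of this exists in the patch set yet:

/* Sketch: flag programs whose verified insns grew by more than 10% over
 * a checked-in baseline. Baseline file format is "<prog> <insns>" per line.
 */
#include <stdio.h>
#include <string.h>

static long find_baseline_insns(FILE *baseline, const char *prog)
{
        char name[128];
        long insns;

        rewind(baseline);
        while (fscanf(baseline, "%127s %ld", name, &insns) == 2) {
                if (strcmp(name, prog) == 0)
                        return insns;
        }
        return -1; /* not in baseline yet */
}

/* Returns 0 if new_insns is within +10% of the baseline, -1 on regression */
static int check_insns_threshold(FILE *baseline, const char *prog, long new_insns)
{
        long old_insns = find_baseline_insns(baseline, prog);

        if (old_insns < 0)
                return 0; /* no baseline recorded, nothing to compare */
        if (new_insns > old_insns + old_insns / 10) {
                fprintf(stderr, "%s: verified insns grew %ld -> %ld (more than +10%%)\n",
                        prog, old_insns, new_insns);
                return -1;
        }
        return 0;
}

A relative threshold like this tolerates noise and small improvements; an
absolute minimum delta could be layered on top so that tiny programs don't
trip it over a handful of extra instructions.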