On Mon, Jun 17, 2019 at 11:58 AM Alexei Starovoitov <ast@xxxxxx> wrote: > > On 6/17/19 9:39 AM, Andrii Nakryiko wrote: > > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@xxxxxxxxxx> wrote: > >> > >> v2->v3: fixed issues in backtracking pointed out by Andrii. > >> The next step is to add a lot more tests for backtracking. > >> > > > > Tests would be great, verifier complexity is at the level, where it's > > very easy to miss issues. > > > > Was fuzzying approach ever discussed for BPF verifier? I.e., have a > > fuzzer to generate both legal and illegal random small programs. Then > > re-implement verifier as user-level program with straightforward > > recursive exhaustive verification (so no state pruning logic, no > > precise/coarse, etc, just register/stack state tracking) of all > > possible branches. If kernel verifier's verdict differs from > > user-level verifier's verdict - flag that as a test case and figure > > out why they differ. Obviously that would work well only for small > > programs, but that should be a good first step already. > > > > In addition, if this is done, that user-land verifier can be a HUGE > > help to BPF application developers, as libbpf would (potentially) be > > able to generate better error messages using it as well. > > In theory that sounds good, but doesn't work in practice. > The kernel verifier keeps changing faster than user space can catch up. > It's also relying on loaded maps and all sorts of callbacks that > check context, allowed helpers, maps, combinations of them from all > over the kernel. > The last effort to build kernel verifier as-is into .o and link > with kmalloc/map wrappers in user space was here: > https://github.com/iovisor/bpf-fuzzer > It was fuzzing the verifier and was able to find few minor bugs. > But it quickly bit rotted. > > Folks brought up in the past the idea to collect user space > verifiers from different kernels, so that user space tooling can > check whether particular program will load on a set of kernels > without need to run them in VMs. > Even if such feature existed today it won't really solve this production > headache, since all kernels prior to today will not be covered. > > I think syzbot is still generating bpf programs. iirc it found > one bug in the past in the verifier core. > I think the only way to make verifier more robust is to keep > adding new test cases manually. > Most interesting bugs we found by humans. > > Another approach to 'better error message' that was considered > in the past was to teach llvm to recognize things that verifier > will reject and let llvm warn on them. > But it's also not practical. We had llvm error on calls. > Then we added them to the verifier and had to change llvm. > If we had llvm error on loops, now we'd need to change it. > imo it's better to let llvm handle everything. That all makes sense. Thanks for elaborate explanation!