On Tue, Apr 26, 2022 at 11:59 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Apr 25, 2022 at 05:45:10PM -0700, Andrii Nakryiko wrote:
> > Teach libbpf to post-process BPF verifier log on BPF program load
> > failure and detect known error patterns to provide user with more
> > context.
> >
> > Currently there is one such common situation: an "unguarded" failed BPF
> > CO-RE relocation. While failing CO-RE relocation is expected, it is
> > expected to be properly guarded in BPF code such that BPF verifier
> > always eliminates BPF instructions corresponding to such failed CO-RE
> > relos as dead code. In cases when user failed to take such precautions,
> > BPF verifier provides the best log it can:
> >
> >   123: (85) call unknown#195896080
> >   invalid func unknown#195896080
> >
> > Such incomprehensible log error is due to libbpf "poisoning" BPF
> > instruction that corresponds to failed CO-RE relocation by replacing it
> > with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
> > "bad relo" if you squint hard enough).
> >
> > Luckily, libbpf has all the necessary information to look up CO-RE
> > relocation that failed and provide more human-readable description of
> > what's going on:
> >
> >   5: <invalid CO-RE relocation>
> >   failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
> >
> > This hopefully makes it much easier to understand what's wrong with
> > user's BPF program without googling magic constants.
> >
> > This BPF verifier log fixup is setup to be extensible and is going to be
> > used for at least one other upcoming feature of libbpf in follow up patches.
> > Libbpf is parsing lines of BPF verifier log starting from the very end.
> > Currently it processes up to 10 lines of code looking for familiar
> > patterns. This avoids wasting lots of CPU processing huge verifier logs
> > (especially for log_level=2 verbosity level). Actual verification error
> > should normally be found in last few lines, so this should work
> > reliably.
> >
> > If libbpf needs to expand log beyond available log_buf_size, it
> > truncates the end of the verifier log. Given verifier log normally ends
> > with something like:
> >
> >   processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >
> > ... truncating this on program load error isn't too bad (end user can
> > always increase log size, if it needs to get complete log).
>
> and it didn't break test_verifier?
> In do_test_single() it does:
>   proc = strstr(bpf_vlog, "processed ");
>   insn_processed = atoi(proc + 10);
>   if (test->insn_processed != insn_processed) {

I forgot to check test_verifier locally, but it's fine according to CI
([0]). This truncation can only happen if libbpf fixes up verifier
log, which currently happens only when there is CO-RE relocation
failure. I don't think we have any CO-RE relocation failure tests in
test_verifier itself. For all other cases there will be absolutely no
change in verifier log output.

  [0] https://github.com/kernel-patches/bpf/runs/6181657272?check_suite_focus=true
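
For anyone skimming the thread, the "scan only the last few log lines" idea
from the commit message could be sketched roughly as below. This is only an
illustration under my own assumptions, not the code from the patch: the
helper name find_in_log_tail(), its parameters, and the POISON_CALL_IMM
macro are all made up for the example. Only the constant's value and the
"invalid func unknown#..." message come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value taken from the commit message: 0xbad2310 == 195896080.
 * The macro name itself is hypothetical.
 */
#define POISON_CALL_IMM 0xbad2310

/* Hypothetical helper, not a libbpf API.
 *
 * Walk backwards over at most max_lines lines of log[0..len) and return a
 * pointer to the start of the first line (counting from the end) that
 * begins with pattern, or NULL if no such line is found in that window.
 */
const char *find_in_log_tail(const char *log, size_t len,
                             const char *pattern, int max_lines)
{
	size_t pat_len, end = len;
	int i;

	assert(log != NULL && pattern != NULL);
	pat_len = strlen(pattern);

	/* Skip one trailing newline so the last line is not seen as empty. */
	if (end > 0 && log[end - 1] == '\n')
		end--;

	for (i = 0; i < max_lines; i++) {
		size_t start = end;

		/* Move start back to the beginning of the current line. */
		while (start > 0 && log[start - 1] != '\n')
			start--;

		/* Current line is log[start..end); check its prefix. */
		if (end - start >= pat_len &&
		    memcmp(log + start, pattern, pat_len) == 0)
			return log + start;

		if (start == 0)
			break; /* reached the beginning of the log */

		end = start - 1; /* step over '\n' to the previous line */
	}

	return NULL;
}
```

With a log that ends in the "processed ..." summary line, calling this with
the pattern "invalid func unknown#195896080" and max_lines = 10 finds the
error line, while max_lines = 1 inspects only the summary line and returns
NULL. Bounding the walk this way keeps the cost independent of the total log
size, which is the point made above about log_level=2 logs.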