On Mon, Apr 25, 2022 at 05:45:10PM -0700, Andrii Nakryiko wrote:
> Teach libbpf to post-process BPF verifier log on BPF program load
> failure and detect known error patterns to provide user with more
> context.
>
> Currently there is one such common situation: an "unguarded" failed BPF
> CO-RE relocation. While failing CO-RE relocation is expected, it is
> expected to be properly guarded in BPF code such that BPF verifier
> always eliminates BPF instructions corresponding to such failed CO-RE
> relos as dead code. In cases when user failed to take such precautions,
> BPF verifier provides the best log it can:
>
>   123: (85) call unknown#195896080
>   invalid func unknown#195896080
>
> Such incomprehensible log error is due to libbpf "poisoning" BPF
> instruction that corresponds to failed CO-RE relocation by replacing it
> with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
> "bad relo" if you squint hard enough).
>
> Luckily, libbpf has all the necessary information to look up CO-RE
> relocation that failed and provide more human-readable description of
> what's going on:
>
>   5: <invalid CO-RE relocation>
>   failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
>
> This hopefully makes it much easier to understand what's wrong with
> user's BPF program without googling magic constants.
>
> This BPF verifier log fixup is set up to be extensible and is going to be
> used for at least one other upcoming feature of libbpf in follow up patches.
> Libbpf is parsing lines of BPF verifier log starting from the very end.
> Currently it processes up to 10 lines of code looking for familiar
> patterns. This avoids wasting lots of CPU processing huge verifier logs
> (especially for log_level=2 verbosity level). Actual verification error
> should normally be found in last few lines, so this should work
> reliably.
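
For reference, the tail scan described above could look roughly like the
sketch below. This is not the code from the patch: find_in_log_tail(),
POISON_LINE and MAX_TAIL_LINES are made-up names for illustration, and only
the 10-line limit and the poison constant are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the constant 195896080 (0xbad2310) is from the log above. */
#define POISON_LINE	"invalid func unknown#195896080"
#define MAX_TAIL_LINES	10

/*
 * Walk a NUL-terminated log backwards, one line at a time, inspecting at
 * most MAX_TAIL_LINES lines. Returns a pointer to the start of the last
 * line that begins with `pat`, or NULL if none of the inspected lines does.
 */
static const char *find_in_log_tail(const char *log, const char *pat)
{
	size_t pat_len, line_len;
	const char *end, *start;
	int i;

	assert(log && pat);
	pat_len = strlen(pat);
	end = log + strlen(log);

	for (i = 0; i < MAX_TAIL_LINES && end > log; i++) {
		/* step over the newline that terminates the current line */
		if (end[-1] == '\n')
			end--;

		/* move back to the first character of the current line */
		start = end;
		while (start > log && start[-1] != '\n')
			start--;

		line_len = (size_t)(end - start);
		if (line_len >= pat_len && strncmp(start, pat, pat_len) == 0)
			return start;

		/* continue with the preceding line */
		end = start;
	}

	return NULL;
}
```

Calling find_in_log_tail(log_buf, POISON_LINE) on a failed load's log would
then locate the poisoned-call line only if it is among the last 10 lines,
so the cost stays bounded no matter how large the log is.
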
>
> If libbpf needs to expand log beyond available log_buf_size, it
> truncates the end of the verifier log. Given verifier log normally ends
> with something like:
>
>   processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>
> ... truncating this on program load error isn't too bad (end user can
> always increase log size, if it needs to get complete log).

and it didn't break test_verifier?
In do_test_single() it does:
	proc = strstr(bpf_vlog, "processed ");
	insn_processed = atoi(proc + 10);
	if (test->insn_processed != insn_processed) {