Re: [PATCH bpf-next 09/10] libbpf: fix up verifier log for unguarded failed CO-RE relos

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Tue, 26 Apr 2022 11:59:38 -0700

On Mon, Apr 25, 2022 at 05:45:10PM -0700, Andrii Nakryiko wrote:
> Teach libbpf to post-process BPF verifier log on BPF program load
> failure and detect known error patterns to provide user with more
> context.
> 
> Currently there is one such common situation: an "unguarded" failed BPF
> CO-RE relocation. While failing CO-RE relocation is expected, it is
> expected to be properly guarded in BPF code such that BPF verifier
> always eliminates BPF instructions corresponding to such failed CO-RE
> relos as dead code. In cases when user failed to take such precautions,
> BPF verifier provides the best log it can:
> 
>   123: (85) call unknown#195896080
>   invalid func unknown#195896080
> 
> Such incomprehensible log error is due to libbpf "poisoning" BPF
> instruction that corresponds to failed CO-RE relocation by replacing it
> with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
> "bad relo" if you squint hard enough).
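
To make the magic constant concrete, here is a minimal sketch of how one might
recognize the poisoned call in a single verifier log line. This is an
illustration only, not libbpf's actual code: the macro name POISON_CALL_IMM and
the helper line_has_poison_call() are made up for this example, and the line
format is taken from the log excerpt quoted above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Immediate quoted in the commit message; the macro name is hypothetical. */
#define POISON_CALL_IMM 0xbad2310

/*
 * Hypothetical helper: returns true if the line looks like
 *   "<insn_idx>: (85) call unknown#<imm>"
 * and <imm> equals the poison constant.
 */
static bool line_has_poison_call(const char *line)
{
	int insn_idx;
	unsigned int imm;

	/* %d skips leading whitespace, so indented log lines are accepted. */
	if (sscanf(line, "%d: (85) call unknown#%u", &insn_idx, &imm) != 2)
		return false;
	return imm == POISON_CALL_IMM;
}
```

With the first log line quoted above as input the helper returns true; the
"invalid func unknown#195896080" line does not start with an instruction index,
so it is rejected.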
> 
> Luckily, libbpf has all the necessary information to look up CO-RE
> relocation that failed and provide more human-readable description of
> what's going on:
> 
>   5: <invalid CO-RE relocation>
>   failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
> 
> This hopefully makes it much easier to understand what's wrong with
> user's BPF program without googling magic constants.
> 
> This BPF verifier log fixup is set up to be extensible and is going to be
> used for at least one other upcoming feature of libbpf in follow-up patches.
> Libbpf is parsing lines of BPF verifier log starting from the very end.
> Currently it processes up to 10 lines of the log looking for familiar
> patterns. This avoids wasting lots of CPU processing huge verifier logs
> (especially for log_level=2 verbosity level). Actual verification error
> should normally be found in last few lines, so this should work
> reliably.
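
The "scan from the end, bounded number of lines" idea can be sketched roughly
as below. This is a simplified illustration under my own assumptions, not the
code from the patch: scan_log_tail(), line_matcher_fn and match_invalid_func()
are hypothetical names, and the only pattern matched is the "invalid func
unknown#" prefix from the log excerpt quoted earlier.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: nonzero means the line matches a known pattern.
 * The line is NOT NUL-terminated; use len. */
typedef int (*line_matcher_fn)(const char *line, size_t len);

/* Example matcher for the "invalid func unknown#..." line. */
static int match_invalid_func(const char *line, size_t len)
{
	static const char prefix[] = "invalid func unknown#";
	size_t plen = sizeof(prefix) - 1;

	return len >= plen && strncmp(line, prefix, plen) == 0;
}

/*
 * Walk the log backwards one line at a time, inspecting at most max_lines
 * lines. Returns the byte offset of the first matching line (counting from
 * the end), or -1 if none of the inspected lines match.
 */
static long scan_log_tail(const char *log, size_t log_len, int max_lines,
			  line_matcher_fn match)
{
	size_t end = log_len;
	int i;

	/* Ignore one trailing newline so the last line is not empty. */
	if (end > 0 && log[end - 1] == '\n')
		end--;

	for (i = 0; i < max_lines && end > 0; i++) {
		size_t start = end;

		/* Move start back to just after the previous newline. */
		while (start > 0 && log[start - 1] != '\n')
			start--;

		if (match(log + start, end - start))
			return (long)start;

		if (start == 0)
			break;
		end = start - 1; /* step over the newline before this line */
	}
	return -1;
}
```

The cost is bounded by the length of the last max_lines lines rather than by
the size of the whole log, which is the point made above for log_level=2
output.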
> 
> If libbpf needs to expand log beyond available log_buf_size, it
> truncates the end of the verifier log. Given verifier log normally ends
> with something like:
> 
>   processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> 
> ... truncating this on program load error isn't too bad (end user can
> always increase log size if they need to get the complete log).

And it didn't break test_verifier?
In do_test_single() it does:
  proc = strstr(bpf_vlog, "processed ");
  insn_processed = atoi(proc + 10);
  if (test->insn_processed != insn_processed) {