On Wed, Dec 8, 2021 at 11:01 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Wed, Dec 8, 2021 at 10:26 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > Andrii Nakryiko wrote: > > > To unify libbpf APIs behavior w.r.t. log_buf and log_level, fix > > > bpf_prog_load() to follow the same logic as bpf_btf_load() and > > > high-level bpf_object__load() API will follow in the subsequent patches: > > > - if log_level is 0 and non-NULL log_buf is provided by a user, attempt > > > load operation initially with no log_buf and log_level set; > > > - if successful, we are done, return new FD; > > > - on error, retry the load operation with log_level bumped to 1 and > > > log_buf set; this way verbose logging will be requested only when we > > > are sure that there is a failure, but will be fast in the > > > common/expected success case. > > > > > > Of course, user can still specify log_level > 0 from the very beginning > > > to force log collection. > > > > > > Suggested-by: Alexei Starovoitov <ast@xxxxxxxxxx> > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > --- > > > > [...] > > > > > @@ -366,16 +368,17 @@ int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type, > > > goto done; > > > } > > > > > > - if (log_level || !log_buf) > > > - goto done; > > > + if (log_level == 0 && !log_buf) { > > ^^^^^^^^ > > > > with non-Null log buf? Seems comment and above are out of sync? > > > > Should it be, if (log_level == 0 && log_buf) { ... } > > Doh... yeah, it should. Apparently inverting a boolean expression is > non-trivial :) I'll add low-level bpf_prog_load() (and maybe > bpf_btf_load() while at it) log_buf tests to log_buf.c in selftests to > catch something like this better, thanks for catching! I did write selftest and of course there was another bug in bpf_btf_load() (log_level wasn't set to 1 on retry). So yay tests. BTW, if anyone runs into problems like "why the error is returned from bpf() syscall if it shouldn't?", I highly recommend trying retsnoop ([0]) to save lots of time trying to pinpoint what exactly is happening. In this case, just running sudo ./retsnoop -e '*sys_bpf' -a ':kernel/bpf/btf.c' -a ':kernel/bpf/verifier.c' -v -n test_progs --lbr helped to pinpoint exact log_level + log_buf check in btf_parse() that was failing, saving tons of time and making libbpf bug obvious. If you don't know even roughly where the problem is, use `sudo ./retsnoop -c bpf -v` (probably with --lbr to see through function inlining, if your kernel is recent enough), this will attach to tons of functions (1500+) and will take a bit longer to start up, but will give you wider coverage. [0] https://github.com/anakryiko/retsnoop > > > > > > + /* log_level == 0 with non-NULL log_buf requires retrying on error > > > + * with log_level == 1 and log_buf/log_buf_size set, to get details of > > > + * failure > > > + */ > > > + attr.log_buf = ptr_to_u64(log_buf); > > > + attr.log_size = log_size; > > > + attr.log_level = 1; > > > > > > - /* Try again with log */ > > > - log_buf[0] = 0; > > > - attr.log_buf = ptr_to_u64(log_buf); > > > - attr.log_size = log_size; > > > - attr.log_level = 1; > > > - > > > - fd = sys_bpf_prog_load(&attr, sizeof(attr), attempts); > > > + fd = sys_bpf_prog_load(&attr, sizeof(attr), attempts); > > > + } > > > done: > > > /* free() doesn't affect errno, so we don't need to restore it */ > > > free(finfo); > > > -- > > > 2.30.2 > > > > > > >