On Thu, Mar 16, 2023 at 11:30:10AM -0700, Andrii Nakryiko wrote: > Currently, if user-supplied log buffer to collect BPF verifier log turns > out to be too small to contain full log, bpf() syscall return -ENOSPC, > fails BPF program verification/load, but preserves first N-1 bytes of > verifier log (where N is the size of user-supplied buffer). > > This is problematic in a bunch of common scenarios, especially when > working with real-world BPF programs that tend to be pretty complex as > far as verification goes and require big log buffers. Typically, it's > when debugging tricky cases at log level 2 (verbose). Also, when BPF program > is successfully validated, log level 2 is the only way to actually see > verifier state progression and all the important details. > > Even with log level 1, it's possible to get -ENOSPC even if the final > verifier log fits in log buffer, if there is a code path that's deep > enough to fill up entire log, even if normally it would be reset later > on (there is a logic to chop off successfully validated portions of BPF > verifier log). > > In short, it's not always possible to pre-size log buffer. Also, in > practice, the end of the log most often is way more important than the > beginning. > > This patch switches BPF verifier log behavior to effectively behave as > rotating log. That is, if user-supplied log buffer turns out to be too > short, instead of failing with -ENOSPC, verifier will keep overwriting > previously written log, effectively treating user's log buffer as a ring > buffer. > > Importantly, though, to preserve good user experience and not require > every user-space application to adopt to this new behavior, before > exiting to user-space verifier will rotate log (in place) to make it > start at the very beginning of user buffer as a continuous > zero-terminated string. The contents will be a chopped off N-1 last > bytes of full verifier log, of course. > > Given beginning of log is sometimes important as well, we add > BPF_LOG_FIXED (which equals 8) flag to force old behavior, which allows > tools like veristat to request first part of verifier log, if necessary. > > On the implementation side, conceptually, it's all simple. We maintain > 64-bit logical start and end positions. If we need to truncate the log, > start position will be adjusted accordingly to lag end position by > N bytes. We then use those logical positions to calculate their matching > actual positions in user buffer and handle wrap around the end of the > buffer properly. Finally, right before returning from bpf_check(), we > rotate user log buffer contents in-place as necessary, to make log > contents contiguous. See comments in relevant functions for details. > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > --- > include/linux/bpf_verifier.h | 32 ++- > kernel/bpf/log.c | 182 +++++++++++++++++- > kernel/bpf/verifier.c | 17 +- > .../selftests/bpf/prog_tests/log_fixup.c | 1 + > 4 files changed, 209 insertions(+), 23 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 83dff25545ee..cff11c99ed9a 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -491,25 +491,42 @@ struct bpf_insn_aux_data { > #define BPF_VERIFIER_TMP_LOG_SIZE 1024 > > struct bpf_verifier_log { > - u32 level; > - char kbuf[BPF_VERIFIER_TMP_LOG_SIZE]; > + /* Logical start and end positions of a "log window" of the verifier log. > + * start_pos == 0 means we haven't truncated anything. > + * Once truncation starts to happen, start_pos + len_total == end_pos, > + * except during log reset situations, in which (end_pos - start_pos) > + * might get smaller than len_total (see bpf_vlog_reset()). > + * Generally, (end_pos - start_pos) gives number of useful data in > + * user log buffer. > + */ > + u64 start_pos; > + u64 end_pos; ... > > -void bpf_vlog_reset(struct bpf_verifier_log *log, u32 new_pos) > +void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos) > { > char zero = 0; > > if (!bpf_verifier_log_needed(log)) > return; > > - log->len_used = new_pos; > + /* if position to which we reset is beyond current log window, then we > + * didn't preserve any useful content and should adjust adjust > + * start_pos to end up with an empty log (start_pos == end_pos) > + */ > + log->end_pos = new_pos; > + if (log->end_pos < log->start_pos) > + log->start_pos = log->end_pos; > + > if (put_user(zero, log->ubuf + new_pos)) Haven't analyzed the code in details and asking based on comments... new_pos can be bigger than ubuf size and above write will be out of bounds, no?