On Wed, Apr 3, 2024 at 11:59 AM Brennan Vincent <brennan.vincent@xxxxxxxxxxxxxxxx> wrote: > > Hello, > > I am working on Parca, a continuous CPU time profiler which works by > doing DWARF-based stack unwinding in an eBPF program. In 6.8 series > kernels, our unwinder can no longer be loaded, as it exceeds the > verifier's limit of 1M instructions processed. > > I was eventually able to get it working, so this is not exactly a > support request, but rather a report of an issue that might affect other > people. > > The issue is 100% reproducible and bisects to > 67420501e8681ae18f9f0ea0a69cd2f432100e70. Hacking the kernel a bit to > increase the instructions processed limit, I found that it successfully > verifies after ~3M instructions, whereas previously it took ~250k, so > more than a 10x regression. > > In case it matters, I'm on arm64. > > I am not an expert on the internal details of the verifier, but I have a > vague theory on what can be causing this behavior. Our unwinder has a > few places where it needs to do binary search on an array, which look > similar to this: > > #define MAX_UNWIND_INFO_BINARY_SEARCH_DEPTH 19 > #define MAX_UNWIND_TABLE_SIZE 250 * 1000 > > static u64 find_offset_for_pc(stack_unwind_table_t *table, u64 pc, u64 left, u64 right) { > u64 found = BINARY_SEARCH_DEFAULT; > > for (int i = 0; i < MAX_UNWIND_INFO_BINARY_SEARCH_DEPTH; i++) { > if (left >= right) { > LOG("\t.done"); > return found; > } > > u32 mid = (left + right) / 2; > > // Appease the verifier. > if (mid < 0 || mid >= MAX_UNWIND_TABLE_SIZE) { > bump_unwind_error_should_never_happen(); > return BINARY_SEARCH_SHOULD_NEVER_HAPPEN; > } > > if (table->rows[mid].pc <= pc) { > found = mid; > left = mid + 1; > } else { > right = mid; > } > } > return BINARY_SEARCH_EXHAUSTED_ITERATIONS; > } > > (For more examples see > https://github.com/parca-dev/parca-agent/blob/main/bpf/unwinders/native.bpf.c > .) > > Note that as I said, I'm not an expert on the verifier, so my theory > could be totally wrong. Please assume I put qualifiers like "I > think...", "I suspect...", etc. in the proper places in what > follows. > > The commit in question made the verifier track the bounds of variables > more precisely, which might allow it to verify more programs. This, > however, has a downside which is that it causes an explosion in the > state space. In previous versions of the kernel, we could just verify > this snippet once because the verifier did not precisely track the > bounds of `mid`. In the new version, we have different bounds for `mid` > in the different branches of the main `if` statement, and the set of > different bounds multiplies in size by 2 each time through the loop, > causing this huge explosion. > > Now, if I intentionally confuse the verifier and prevent it from knowing > the bounds of `mid`, e.g.: > > static u32 __attribute__((optnone)) scramble(u32 val) { > return val ^ 0xFFFFFFFF; > } > > then, in the main binary search loop: > > u32 mid = (left + right) / 2; > mid = scramble(scramble(mid)); > > then the program verifies fine, as it did before the commit in question. Yep, I think your analysis is correct, and your obfuscation workaround is "sane" :) Having said that, I think it would benefit your BPF programs from thinking a bit more carefully about scaling your BPF verification process by using various features and techniques that keep verification complexity more bounded. Specifically, I think you can benefit from making find_offset_for_pc a global function, in which case it will be validated just once, separately, and then not re-validated. Also, if you can rely on that, using bpf_for() loops (based on open-coded iterators) or bpf_loop() helper would also help minimize verification complexity of your loops. > > I am not sure what, if anything, should be done about this in the kernel, > as it seems there's a fundamental tradeoff between precision of the > verifier and performance in terms of instructions processed. > > -- > Brennan (he/him) > Staff Software Engineer | Polar Signals Inc. > E: brennan.vincent@xxxxxxxxxxxxxxxx > W: umanwizard.com > TZ: America/New_York