My coworkers and I have been playing with tracing use cases with bpf and we uncovered some programs we did not expect to pass verification. We believe that there's incorrect logic when in the verifier in the context of the `bpf_loop` and `bpf_for_each_map_elem` helpers (I suspect also `bpf_user_ringbuf_drain`, but I haven't verified it). These helpers are interesting in that they take a callback function, and a context pointer which will be passed into the callback function; these are the helpers that do some form of iterations. The observation is that the verifier doesn't take into account possible changes to the context made by the callback. I'm interested in better understanding how the verifier is supposed to work in these cases, and to collaborate on fixing the underlying problem. Consider the following probes (Note: this will crash the machine if triggered): ```C #include <linux/bpf.h> #include <linux/ptrace.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> char _license[] SEC("license") = "GPL"; typedef struct context { char* buf; } context_t; static long loop_callback(__u32 idx, context_t* ctx) { if (idx == 0) { ctx->buf = (char*)(0xDEAD); return 0; } if (bpf_probe_read_user(ctx->buf, 8, (void*)(0xBADC0FFEE))) { return 1; } return 0; } SEC("uprobe/bpf_loop_bug") int BPF_KPROBE(bpf_loop_bug, void* foo_ptr) { __u64 buf = 0; context_t context = (context_t) { .buf = (char*)(&buf) }; bpf_loop(100, loop_callback, &context, 0); } ``` As far as I can tell, the verifier only verifies the callback relative to the state of the context value when it's initially passed in. This can problematically allow for pointers to erroneously be considered valid pointers to map data when in reality those pointers can point anywhere. The example probes use this to write arbitrary data into arbitrary memory addresses. >From my reading of the verifier code, it seems that only exactly one iteration of the loop is checked. The important call, as I understand it, is `__check_func_call`[1]. It seems that that logic adjusts the verifier state to move the arguments into the appropriate registers, and then passes verification into the body of the function, but then, when the verifier hits a return, it just goes back to the caller as opposed to verifying the next iteration of the loop. Am I missing something? When it comes to fixing the problem, I don't quite know where to start. Perhaps these iteration callbacks ought to be treated more like global functions -- you can't always make assumptions about the state of the data in the context pointer. Treating the context pointer as totally opaque seems bad from a usability perspective. Maybe there's a way to attempt to verify the function body of the function by treating all or part of the context as read-only, and then if that fails, go back and assume nothing about that part of the context structure. What is the right way to think about plugging this hole? Somewhat tangential, it seems that if I later interact with the data in the context which may have been mutated, I get a rather opaque verifier error about the context type itself: `reg type unsupported for arg#1 function loop_callback#33`. I'd be interested to understand more deeply why this combined version fails verification, and how to improve the error. See the following probe: ```C SEC("uprobe/both") int BPF_KPROBE(bpf_for_each_map_elem_bug, void *foo_ptr) { __u64 buf = 0; context_t context = (context_t) {.buf = (char *)(&buf) }; bpf_for_each_map_elem(&ARRAY, for_each_map_elem_callback, &context, 0); bpf_loop(100, loop_callback, &context, 0); } ``` Just for completeness, here's one using `bpf_for_each_map_elem`: ```C struct { __uint(type, BPF_MAP_TYPE_ARRAY); __type(key, __u32); __type(value, __u64); __uint(max_entries, 100); } ARRAY SEC(".maps"); static long for_each_map_elem_callback( struct bpf_map* map, const __u32* key, __u64* value, context_t* ctx ) { if (*(__u64*)(ctx->buf) == 0) { ctx->buf = (char*)(0xDEAD); return 0; } if (bpf_probe_read_user(ctx->buf, 8, (void*)(0xBADC0FFEE))) { return 1; } return 0; } SEC("uprobe/for_each_map_elem_bug") int BPF_KPROBE(for_each_map_elem_bug, void* foo_ptr) { __u64 buf = 0; context_t context = (context_t) { .buf = (char*)(&buf) }; bpf_for_each_map_elem(&ARRAY, for_each_map_elem_callback, &context, 0); bpf_loop(100, loop_callback, &context, 0); } ``` [1]: https://github.com/torvalds/linux/blob/5133c9e51de41bfa902153888e11add3342ede18/kernel/bpf/verifier.c#L8668