We discussed this topic today at office hour. As I mentioned, I don't know the ideal solution, but here is something that has enough flexibility for real-world uses, while giving the performance and convenience of reserve/commit API. Ignore naming, we can bikeshed that later. So what we can do is introduce a new bpf_ringbuf_reserve() variant: bpf_ringbuf_reserve_extra(void *ringbuf, __u64 size, __u64 flags, void *extra, __u64 extra_sz); The idea is that we reserve a fixed size amount of data that can be used like it is today for filling a fixed-sized metadata/sample directly. But the real size of the reserved sample is (size + extra_sz), and bpf_ringbuf_reserve_extra() helper will bpf_probe_read (kernel or user, depending on flags) data from extra and put it right after the fixed-size part. So the use would be something like: struct my_meta *m = bpf_ringbuf_reserve_extra(&rb, sizeof(*m), BPF_RB_PROBE_USER, env_vars, 1024); if (!m) /* too bad, either probe_read_user failed or ringbuf is full */ return 1; m->my_field1 = 123; m->my_field2 = 321; So the main problem with this is that when probe_read fails, we fail reservation completely(internally we'd just discard ringbuf sample). Is that OK? Or is it better to still reserve fixed-sized part and zero-out the variable-length part? We are combining two separate operations into a single API, so error handling is more convoluted. Now, the main use case requested was to be able to fetch an array of zero-terminated strings. I honestly don't think it's possible to implement this efficiently without two copies of string data. Mostly because to just determine the size of the string you have to read it one extra time. And you'd probably want to copy string data into some controlled storage first, so that you don't end up reading it once successfully and then failing to read it on the second try. Next, when you have multiple strings, how do you deal with partial failures? It's even worse in terms of error handling and error propagation than the fixed extra size variant described above. Ignoring all that, let's say we'd implement bpf_ringbuf_reserve_extra_strs() helper, that would somehow be copying multiple zero-terminated strings after the fixed-size prefix. Think about implementation. Just to determine the total size of the ringbuf sample, you'd need to read all strings once, and probably also copy them locally. Then you'd reserve a ringbuf sample and copy all that for the second time. So it's as inefficient as a BPF program constructing a single block of memory by reading all such strings manually into a per-CPU array and then using the above bpf_ringbuf_reserve_extra(). But offloading that preparation to a BPF program bypasses all these error handling and memory layout questions. It will be up to a BPF program itself. From a kernel perspective, we just append a block of memory with known (at runtime) size. As a more restricted version of bpf_ringbuf_reserve_extra(), instead of allowing reading arbitrary kernel or user-space memory in bpf_ringbuf_reserve_extra() we can say that it has to be known and initialized memory (like MAP_VALUE pointer), so helper knows that it can just copy data directly. Thoughts? -- Andrii