Hi,

On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@xxxxxxxxx> wrote:
>> Before this patch, the producer position could overflow `unsigned
>> long`, in which case libbpf would forever stop processing new writes to
>> the ringbuf. This patch addresses that bug by avoiding ordered
>> comparison between the consumer and producer position. If the consumer
>> position is greater than the producer position, the assumption is that
>> the producer has overflowed.
>>
>> A more defensive check could be to ensure that the delta is within
>> the allowed range, but such defensive checks are present neither in
>> the kernel-side code nor in libbpf. The overflow that this patch
>> handles can occur while the producer and consumer follow a correct
>> protocol.
>
> Yep, great find!
>
>> A selftest was written to demonstrate the bug, and indeed this patch
>> allows the test to continue to make progress past the overflow.
>> However, the author was unable to create a testing environment on a
>> 32-bit machine, and the test requires substantial memory and over 4
>
> 2GB of memory for the ringbuf, right? Perhaps it would be good to just
> outline the repro, even if we won't have it implemented in selftests.
> Something along the lines of: a) set up a ringbuf of 2GB size and b)
> reserve+commit maximum-sized records (UMAX/4) constantly, as fast as
> possible. With 1 million records per second, the repro time should be
> about 4.7 hours. Can you please update the commit with something like
> that instead of a vague "there is a repro, but I won't show it ;)"?
> Thanks!

I think it would be great if the commit message elaborated on the repro.
Andrew had already posted an external link to the reproducer in v2 [0]:

https://github.com/ajwerner/bpf/commit/85e1240e7713

[0]: https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@xxxxxxxxx/

>> hours to hit the overflow point on a 64-bit machine.
>> Thus, the test
>> is not included in this patch because of the impracticality of running
>> it.
>>
>> Additionally, this patch adds commentary around a separate point to note
>> that the modular arithmetic is valid in the face of overflows, as that
>> fact may not be obvious to future readers.
>>
>> Signed-off-by: Andrew Werner <awerner32@xxxxxxxxx>
>> ---
>>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>> index 02199364db13..6271757bc3d2 100644
>> --- a/tools/lib/bpf/ringbuf.c
>> +++ b/tools/lib/bpf/ringbuf.c
>> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
>>  	do {
>>  		got_new_data = false;
>>  		prod_pos = smp_load_acquire(r->producer_pos);
>> -		while (cons_pos < prod_pos) {
>> +
>> +		/* Avoid signed comparisons between the positions; the producer
>
> "signed comparisons" is confusing and invalid, as cons_pos and
> prod_pos are unsigned and the comparison here is unsigned. What you
> meant is inequality comparison, which is invalid when done naively
> (like it is done in libbpf right now, sigh...), if counters can wrap
> around.
>
>> +		 * position can overflow before the consumer position.
>> +		 */
>> +		while (cons_pos != prod_pos) {
>
> I'm wondering if we should preserve the "consumer pos is before
> producer pos" check just for clarity's sake (with a comment about the
> wrapping around of counters, of course), like:
>
>   if ((long)(cons_pos - prod_pos) < 0) ?
>
> BTW, I think the kernel code needs fixing as well in
> __bpf_user_ringbuf_peek (we shouldn't compare consumer/producer
> positions directly, only through subtraction and casting to signed
> long as above); would you be able to fix it at the same time as
> libbpf? It would be good to also double-check the rest of
> kernel/bpf/ringbuf.c to make sure we don't directly compare positions
> anywhere else.

I missed __bpf_user_ringbuf_peek() when reviewing the patch.
I can also help double-check kernel/bpf/ringbuf.c.

>>  			len_ptr = r->data + (cons_pos & r->mask);
>>  			len = smp_load_acquire(len_ptr);
>>
>> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
>>  	prod_pos = smp_load_acquire(rb->producer_pos);
>>
>>  	max_size = rb->mask + 1;
>> +
>> +	/* Note that this formulation is valid in the face of overflow of
>> +	 * prod_pos so long as the delta between prod_pos and cons_pos
>> +	 * remains no greater than max_size.
>> +	 */
>>  	avail_size = max_size - (prod_pos - cons_pos);
>>  	/* Round up total size to a multiple of 8. */
>>  	total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
>> --
>> 2.39.2
>>
>>
> .