On Tue, Aug 22, 2023 at 6:49 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > Hi, > > On 8/22/2023 1:15 PM, Andrii Nakryiko wrote: > > On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@xxxxxxxxx> wrote: > >> Before this patch, the producer position could overflow `unsigned > >> long`, in which case libbpf would forever stop processing new writes to > >> the ringbuf. This patch addresses that bug by avoiding ordered > >> comparison between the consumer and producer position. If the consumer > >> position is greater than the producer position, the assumption is that > >> the producer has overflowed. > >> > >> A more defensive check could be to ensure that the delta is within > >> the allowed range, but such defensive checks are neither present in > >> the kernel side code nor in libbpf. The overflow that this patch > >> handles can occur while the producer and consumer follow a correct > >> protocol. > > Yep, great find! > > > >> A selftest was written to demonstrate the bug, and indeed this patch > >> allows the test to continue to make progress past the overflow. > >> However, the author was unable to create a testing environment on a > >> 32-bit machine, and the test requires substantial memory and over 4 > > 2GB of memory for ringbuf, right? Perhaps it would be good to just > > outline the repro, even if we won't have it implemented in selftests. > > Something along the lines of: a) set up ringbuf of 2GB size and > > reserve+commit maximum-sized record (UMAX/4) constantly as fast as > > possible. With 1 million records per second repro time should be about > > 4.7 hours. Can you please update the commit with something like that > > instead of a vague "there is repro, but I won't show it ;)" ? Thanks! > > I think it would be great that the commit message can elaborate about > the repo. Andrew had already posted an external link to the reproducer > in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713 Yeah, of course, I saw it. I didn't mean that he actually refuses to show repro, it's just that the commit message is vague about it and we should improve that. I think we are on the same page. Including the link to full repro in addition to more detailed description is also good. > > [0]: > https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@xxxxxxxxx/ > >> hours to hit the overflow point on a 64-bit machine. Thus, the test > >> is not included in this patch because of the impracticality of running > >> it. > >> > >> Additionally, this patch adds commentary around a separate point to note > >> that the modular arithmetic is valid in the face of overflows, as that > >> fact may not be obvious to future readers. > >> > >> Signed-off-by: Andrew Werner <awerner32@xxxxxxxxx> > >> --- > >> tools/lib/bpf/ringbuf.c | 11 ++++++++++- > >> 1 file changed, 10 insertions(+), 1 deletion(-) > >> > >> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c > >> index 02199364db13..6271757bc3d2 100644 > >> --- a/tools/lib/bpf/ringbuf.c > >> +++ b/tools/lib/bpf/ringbuf.c > >> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r) > >> do { > >> got_new_data = false; > >> prod_pos = smp_load_acquire(r->producer_pos); > >> - while (cons_pos < prod_pos) { > >> + > >> + /* Avoid signed comparisons between the positions; the producer > > "signed comparisons" is confusing and invalid, as cons_pos and > > prod_pos are unsigned and comparison here is unsigned. What you meant > > is inequality comparison, which is invalid when done naively (like it > > is done in libbpf right now, sigh...), if counters can wrap around. > > > >> + * position can overflow before the consumer position. > >> + */ > >> + while (cons_pos != prod_pos) { > > I'm wondering if we should preserve the "consumer pos is before > > producer pos" check just for clarity's sake (with a comment about > > wrapping around of counters, of course) like: > > > > if ((long)(cons_pos - prod_pos) < 0) ? > > > > BTW, I think kernel code needs fixing as well in > > __bpf_user_ringbuf_peek (we should compare consumer/producer positions > > directly, only through subtraction and casting to signed long as > > above), would you be able to fix it at the same time with libbpf? > > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to > > make sure we don't directly compare positions anywhere else. > > I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can > help to double-check kernel/bpf/ringbuf.c. great, thanks! > > > >> len_ptr = r->data + (cons_pos & r->mask); > >> len = smp_load_acquire(len_ptr); > >> > >> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size) > >> prod_pos = smp_load_acquire(rb->producer_pos); > >> > >> max_size = rb->mask + 1; > >> + > >> + /* Note that this formulation in the face of overflow of prod_pos > >> + * so long as the delta between prod_pos and cons_pos remains no > >> + * greater than max_size. > >> + */ > >> avail_size = max_size - (prod_pos - cons_pos); > >> /* Round up total size to a multiple of 8. */ > >> total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8; > >> -- > >> 2.39.2 > >> > >> > > . > >