On Tue, Apr 27, 2021 at 11:34 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Tue, Apr 27, 2021 at 10:09 AM Brendan Jackman <jackmanb@xxxxxxxxxx> wrote: > > > > One of our benchmarks running in (Google-internal) CI pushes data > > through the ringbuf faster than userspace is able to consume > > it. In this case it seems we're actually able to get >INT_MAX entries > > in a single ringbuf_buffer__consume call. ASAN detected that cnt > > overflows in this case. > > > > Fix by just setting a limit on the number of entries that can be > > consumed. > > > > Fixes: bf99c936f947 (libbpf: Add BPF ring buffer support) > > Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx> > > --- > > tools/lib/bpf/ringbuf.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c > > index e7a8d847161f..445a21df0934 100644 > > --- a/tools/lib/bpf/ringbuf.c > > +++ b/tools/lib/bpf/ringbuf.c > > @@ -213,8 +213,8 @@ static int ringbuf_process_ring(struct ring* r) > > do { > > got_new_data = false; > > prod_pos = smp_load_acquire(r->producer_pos); > > - while (cons_pos < prod_pos) { > > + /* Don't read more than INT_MAX, or the return vale won't make sense. */ > > + while (cons_pos < prod_pos && cnt < INT_MAX) { > > ring_buffer__pool() is assumed to not return until all the enqueued > messages are consumed. That's the requirement for the "adaptive" > notification scheme to work properly. So this will break that and > cause the next ring_buffer__pool() to never wake up. > > We could use __u64 internally and then cap it to INT_MAX on return > maybe? But honestly, this sounds like an artificial corner case, if > you are producing data faster than you can consume it and it goes > beyond INT_MAX, something is seriously broken in your application and Disclaimer: I don't know what Brendan's benchmark is actually doing That said, I have seen similar boundaries being reached when doing process monitoring and then a kernel gets compiled (esp. with ccache) and generates a large amount of process events in a very short span of time. Another example is when someone runs a short process in a tight while loop. I agree it's a matter of tuning, but since these corner cases can be easily triggered even on real (non CI) systems no matter how much one tunes, I wouldn't really call it artificial :) - KP > you have more important things to handle :) > > > len_ptr = r->data + (cons_pos & r->mask); > > len = smp_load_acquire(len_ptr); > > > > -- > > 2.31.1.498.g6c1eba8ee3d-goog > >