Re: [PATCH] libbpf: handle producer position overflow

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 23 Aug 2023 09:52:28 -0700

On Tue, Aug 22, 2023 at 6:49 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> > On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@xxxxxxxxx> wrote:
> >> Before this patch, the producer position could overflow `unsigned
> >> long`, in which case libbpf would forever stop processing new writes to
> >> the ringbuf. This patch addresses that bug by avoiding ordered
> >> comparison between the consumer and producer position. If the consumer
> >> position is greater than the producer position, the assumption is that
> >> the producer has overflowed.
> >>
> >> A more defensive check could be to ensure that the delta is within
> >> the allowed range, but such defensive checks are neither present in
> >> the kernel side code nor in libbpf. The overflow that this patch
> >> handles can occur while the producer and consumer follow a correct
> >> protocol.
> > Yep, great find!
> >
> >> A selftest was written to demonstrate the bug, and indeed this patch
> >> allows the test to continue to make progress past the overflow.
> >> However, the author was unable to create a testing environment on a
> >> 32-bit machine, and the test requires substantial memory and over 4
> > 2GB of memory for ringbuf, right? Perhaps it would be good to just
> > outline the repro, even if we won't have it implemented in selftests.
> > Something along the lines of: a) set up ringbuf of 2GB size and
> > reserve+commit maximum-sized record (UMAX/4) constantly as fast as
> > possible. With 1 million records per second repro time should be about
> > 4.7 hours. Can you please update the commit with something like that
> > instead of a vague "there is repro, but I won't show it ;)" ? Thanks!
>
> I think it would be great that the commit message can elaborate about
> the repo. Andrew had already posted an external link to the reproducer
> in v2 [0]: https://github.com/ajwerner/bpf/commit/85e1240e7713

Yeah, of course, I saw it. I didn't mean that he actually refuses to
show repro, it's just that the commit message is vague about it and we
should improve that. I think we are on the same page. Including the
link to full repro in addition to more detailed description is also
good.

>
> [0]:
> https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@xxxxxxxxx/
> >> hours to hit the overflow point on a 64-bit machine. Thus, the test
> >> is not included in this patch because of the impracticality of running
> >> it.
> >>
> >> Additionally, this patch adds commentary around a separate point to note
> >> that the modular arithmetic is valid in the face of overflows, as that
> >> fact may not be obvious to future readers.
> >>
> >> Signed-off-by: Andrew Werner <awerner32@xxxxxxxxx>
> >> ---
> >>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
> >>  1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> >> index 02199364db13..6271757bc3d2 100644
> >> --- a/tools/lib/bpf/ringbuf.c
> >> +++ b/tools/lib/bpf/ringbuf.c
> >> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
> >>         do {
> >>                 got_new_data = false;
> >>                 prod_pos = smp_load_acquire(r->producer_pos);
> >> -               while (cons_pos < prod_pos) {
> >> +
> >> +               /* Avoid signed comparisons between the positions; the producer
> > "signed comparisons" is confusing and invalid, as cons_pos and
> > prod_pos are unsigned and comparison here is unsigned. What you meant
> > is inequality comparison, which is invalid when done naively (like it
> > is done in libbpf right now, sigh...), if counters can wrap around.
> >
> >> +                * position can overflow before the consumer position.
> >> +                */
> >> +               while (cons_pos != prod_pos) {
> > I'm wondering if we should preserve the "consumer pos is before
> > producer pos" check just for clarity's sake (with a comment about
> > wrapping around of counters, of course) like:
> >
> > if ((long)(cons_pos - prod_pos) < 0) ?
> >
> > BTW, I think kernel code needs fixing as well in
> > __bpf_user_ringbuf_peek (we should compare consumer/producer positions
> > directly, only through subtraction and casting to signed long as
> > above), would you be able to fix it at the same time with libbpf?
> > Would be good to also double-check the rest of kernel/bpf/ringbuf.c to
> > make sure we don't directly compare positions anywhere else.
>
> I missed __bpf_user_ringbuf_peek() when reviewed the patch. I also can
> help to double-check kernel/bpf/ringbuf.c.

great, thanks!

> >
> >>                         len_ptr = r->data + (cons_pos & r->mask);
> >>                         len = smp_load_acquire(len_ptr);
> >>
> >> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
> >>         prod_pos = smp_load_acquire(rb->producer_pos);
> >>
> >>         max_size = rb->mask + 1;
> >> +
> >> +       /* Note that this formulation in the face of overflow of prod_pos
> >> +        * so long as the delta between prod_pos and cons_pos remains no
> >> +        * greater than max_size.
> >> +        */
> >>         avail_size = max_size - (prod_pos - cons_pos);
> >>         /* Round up total size to a multiple of 8. */
> >>         total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
> >> --
> >> 2.39.2
> >>
> >>
> > .
>
>