Hi,

On 8/22/2023 1:15 PM, Andrii Nakryiko wrote:
> On Mon, Jul 24, 2023 at 6:24 AM Andrew Werner <awerner32@xxxxxxxxx> wrote:
>> Before this patch, the producer position could overflow `unsigned
>> long`, in which case libbpf would forever stop processing new writes to
>> the ringbuf. This patch addresses that bug by avoiding ordered
>> comparison between the consumer and producer position. If the consumer
>> position is greater than the producer position, the assumption is that
>> the producer has overflowed.
>>
>> A more defensive check could be to ensure that the delta is within
>> the allowed range, but such defensive checks are present neither in
>> the kernel-side code nor in libbpf. The overflow that this patch
>> handles can occur while the producer and consumer follow a correct
>> protocol.
>
> Yep, great find!
>
>> A selftest was written to demonstrate the bug, and indeed this patch
>> allows the test to continue to make progress past the overflow.
>> However, the author was unable to create a testing environment on a
>> 32-bit machine, and the test requires substantial memory and over 4
>
> 2GB of memory for the ringbuf, right? Perhaps it would be good to just
> outline the repro, even if we won't have it implemented in selftests.
> Something along the lines of: a) set up a ringbuf of 2GB size and b)
> reserve+commit maximum-sized records (UMAX/4) constantly, as fast as
> possible. With 1 million records per second, the repro time should be
> about 4.7 hours. Can you please update the commit with something like
> that instead of a vague "there is a repro, but I won't show it ;)"?
> Thanks!

I think it would be great if the commit message elaborated on the repro.
Andrew had already posted an external link to the reproducer in v2 [0]:

https://github.com/ajwerner/bpf/commit/85e1240e7713

[0]: https://lore.kernel.org/bpf/20230724132543.1282645-1-awerner32@xxxxxxxxx/

>> hours to hit the overflow point on a 64-bit machine.
>> Thus, the test
>> is not included in this patch because of the impracticality of running
>> it.
>>
>> Additionally, this patch adds commentary around a separate point to note
>> that the modular arithmetic is valid in the face of overflows, as that
>> fact may not be obvious to future readers.
>>
>> Signed-off-by: Andrew Werner <awerner32@xxxxxxxxx>
>> ---
>>  tools/lib/bpf/ringbuf.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>> index 02199364db13..6271757bc3d2 100644
>> --- a/tools/lib/bpf/ringbuf.c
>> +++ b/tools/lib/bpf/ringbuf.c
>> @@ -237,7 +237,11 @@ static int64_t ringbuf_process_ring(struct ring *r)
>>  	do {
>>  		got_new_data = false;
>>  		prod_pos = smp_load_acquire(r->producer_pos);
>> -		while (cons_pos < prod_pos) {
>> +
>> +		/* Avoid signed comparisons between the positions; the producer
>
> "signed comparisons" is confusing and invalid, as cons_pos and
> prod_pos are unsigned and the comparison here is unsigned. What you
> meant is inequality comparison, which is invalid when done naively
> (like it is done in libbpf right now, sigh...), if counters can wrap
> around.
>
>> +		 * position can overflow before the consumer position.
>> +		 */
>> +		while (cons_pos != prod_pos) {
>
> I'm wondering if we should preserve the "consumer pos is before
> producer pos" check just for clarity's sake (with a comment about the
> wrapping around of counters, of course), like:
>
>   if ((long)(cons_pos - prod_pos) < 0) ?
>
> BTW, I think the kernel code needs fixing as well in
> __bpf_user_ringbuf_peek (we shouldn't compare consumer/producer
> positions directly, only through subtraction and casting to signed
> long as above); would you be able to fix it at the same time as
> libbpf? It would be good to also double-check the rest of
> kernel/bpf/ringbuf.c to make sure we don't directly compare positions
> anywhere else.

I missed __bpf_user_ringbuf_peek() when reviewing the patch.
I can also help double-check kernel/bpf/ringbuf.c.

>>  			len_ptr = r->data + (cons_pos & r->mask);
>>  			len = smp_load_acquire(len_ptr);
>>
>> @@ -498,6 +502,11 @@ void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size)
>>  	prod_pos = smp_load_acquire(rb->producer_pos);
>>
>>  	max_size = rb->mask + 1;
>> +
>> +	/* Note that this formulation is valid in the face of overflow of
>> +	 * prod_pos so long as the delta between prod_pos and cons_pos
>> +	 * remains no greater than max_size.
>> +	 */
>>  	avail_size = max_size - (prod_pos - cons_pos);
>>  	/* Round up total size to a multiple of 8. */
>>  	total_size = (size + BPF_RINGBUF_HDR_SZ + 7) / 8 * 8;
>> --
>> 2.39.2
>>
>>
> .