Re: [PATCH bpf 2/4] libbpf: Handle size overflow for ringbuf mmap

Hi,

On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> On Fri, Nov 11, 2022 at 9:54 AM <sdf@xxxxxxxxxx> wrote:
>> On 11/11, Hou Tao wrote:
>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>> The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
>>> will overflow u32 when mapping producer page and data pages. Only
>>> casting max_entries to size_t is not enough, because for 32-bits
>>> application on 64-bits kernel the size of read-only mmap region
>>> also could overflow size_t.
>>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
>>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
>>> ---
>>>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
>>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>>> index d285171d4b69..c4bdc88af672 100644
>>> --- a/tools/lib/bpf/ringbuf.c
>>> +++ b/tools/lib/bpf/ringbuf.c
>>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>>>       __u32 len = sizeof(info);
>>>       struct epoll_event *e;
>>>       struct ring *r;
>>> +     __u64 ro_size;
> I found ro_size quite a confusing name, let's call it mmap_sz?
OK.
>
>>>       void *tmp;
>>>       int err;
>>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>>>        * data size to allow simple reading of samples that wrap around the
>>>        * end of a ring buffer. See kernel implementation for details.
>>>        * */
>>> -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
>>> -                MAP_SHARED, map_fd, rb->page_size);
>>> +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
>> [..]
>>
>>> +     if (ro_size != (__u64)(size_t)ro_size) {
>>> +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
>>> +                     info.max_entries);
>>> +             return libbpf_err(-E2BIG);
>>> +     }
>> Why do we need this check at all? IIUC, the problem is that the expression
>> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
>> overflow. So why doing this part only isn't enough?
>>
>> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
>> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
>>
>> sizeof(size_t) should be 8, so no overflow is possible?
> not on 32-bit arches, presumably?
Yes. For a 32-bit kernel, the total virtual address space shared by user space
and kernel space is 4GB, so when max_entries is 2GB the needed virtual address
space would be 2GB + 4GB, and the mapping of the ring buffer will fail either in
the kernel or in userspace. An extreme case is 32-bit userspace under a 64-bit
kernel: mapping the 2GB ring buffer in the kernel is OK, but the 4GB userspace
mapping will overflow size_t in the 32-bit application.
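
To make the two overflows concrete, here is a minimal standalone sketch. It is
not the libbpf code: the helper names are made up for illustration, the page
size of 4096 is an assumption, and uint32_t stands in for a 32-bit size_t so
that the check can be exercised on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: the whole expression is evaluated in 32 bits and can wrap. */
static uint32_t mmap_size_u32(uint32_t page_size, uint32_t max_entries)
{
	return page_size + 2 * max_entries;
}

/* Patched form: widen max_entries before multiplying, so nothing wraps. */
static uint64_t mmap_size_u64(uint32_t page_size, uint32_t max_entries)
{
	return page_size + 2 * (uint64_t)max_entries;
}

/*
 * Round-trip check in the style of the patch. uint32_t is used here as a
 * stand-in for size_t of a 32-bit application (assumption for illustration).
 */
static int fits_size_t32(uint64_t sz)
{
	return sz == (uint64_t)(uint32_t)sz;
}

/* Illustrative values only: 4096-byte pages, max_entries = 2GB. */
static void self_check(void)
{
	const uint32_t page = 4096;
	const uint32_t two_gb = UINT32_C(1) << 31;

	/* 2 * 2GB wraps to 0 in u32, so only the producer page is left. */
	assert(mmap_size_u32(page, two_gb) == page);
	/* The widened computation gives the real size, 4GB + one page. */
	assert(mmap_size_u64(page, two_gb) == (UINT64_C(1) << 32) + page);
	/* That size cannot be represented in a 32-bit size_t. */
	assert(!fits_size_t32(mmap_size_u64(page, two_gb)));
}
```

Calling self_check() from a small main() passes silently. The second assert is
why the cast alone fixes 64-bit applications, and the third is why the extra
E2BIG check is still needed for 32-bit ones.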
>


>
>
>>
>>> +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
>>> +                rb->page_size);
> should we split this mmap into two mmaps -- one for producer_pos page,
> another for data area. That will presumably allow to mmap ringbuf with
> max_entries = 1GB?
I don't understand the reason for the split. Even without it, a ring buffer
with max_entries = 1GB is OK in theory on a 32-bit kernel, although in practice
mapping a 1GB ring buffer on a 32-bit kernel will fail because the most common
size of the kernel virtual address space is 1GB (although ARM could use
VMSPLIT_1G to increase the kernel virtual address space to 3GB).
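
A rough budget for the max_entries = 1GB case can be sketched as below. The
3GB/1GB user/kernel split and the 4096-byte page are assumed figures for a
common 32-bit configuration, the helper names are hypothetical, and the kernel
side is only a lower bound (the data pages mapped once). It ignores the
consumer page and everything else that already lives in either address space.

```c
#include <assert.h>
#include <stdint.h>

#define GB (UINT64_C(1) << 30)

/* Assumed 32-bit layout (3G user / 1G kernel); not taken from the patch. */
#define USER_VA_SPACE   (3 * GB)
#define KERNEL_VA_SPACE (1 * GB)

/* Single read-only mapping: producer page plus the data area mapped twice. */
static uint64_t ro_mmap_size(uint64_t page_size, uint64_t max_entries)
{
	return page_size + 2 * max_entries;
}

/* Does the unsplit read-only mapping leave any room in user address space? */
static int fits_user_va(uint64_t page_size, uint64_t max_entries)
{
	return ro_mmap_size(page_size, max_entries) < USER_VA_SPACE;
}

/*
 * Lower bound for the kernel side: it has to map at least max_entries bytes
 * of data, and that must leave room for the rest of the kernel.
 */
static int fits_kernel_va(uint64_t max_entries)
{
	return max_entries < KERNEL_VA_SPACE;
}

/* Illustrative check of the 1GB case discussed above. */
static void budget_check(void)
{
	assert(ro_mmap_size(4096, 1 * GB) == 2 * GB + 4096);
	assert(fits_user_va(4096, 1 * GB));   /* OK in theory without a split */
	assert(!fits_kernel_va(1 * GB));      /* but the kernel side is full */
}
```

Under these assumptions the userspace mapping is not the limiting factor at
1GB, so splitting the mmap would not change the outcome.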
>
>>>       if (tmp == MAP_FAILED) {
>>>               err = -errno;
>>>               ringbuf_unmap_ring(rb, r);
>>> --
>>> 2.29.2



