On Fri, Nov 11, 2022 at 7:34 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> > On Fri, Nov 11, 2022 at 9:54 AM <sdf@xxxxxxxxxx> wrote:
> >> On 11/11, Hou Tao wrote:
> >>> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>>
> >>> The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
> >>> will overflow u32 when mapping producer page and data pages. Only
> >>> casting max_entries to size_t is not enough, because for 32-bits
> >>> application on 64-bits kernel the size of read-only mmap region
> >>> also could overflow size_t.
> >>>
> >>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> >>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> >>> ---
> >>>  tools/lib/bpf/ringbuf.c | 11 +++++++++--
> >>>  1 file changed, 9 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> >>> index d285171d4b69..c4bdc88af672 100644
> >>> --- a/tools/lib/bpf/ringbuf.c
> >>> +++ b/tools/lib/bpf/ringbuf.c
> >>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >>>  	__u32 len = sizeof(info);
> >>>  	struct epoll_event *e;
> >>>  	struct ring *r;
> >>> +	__u64 ro_size;
> >
> > I found ro_size quite a confusing name, let's call it mmap_sz?
>
> OK.
>
> >>>  	void *tmp;
> >>>  	int err;
> >>>
> >>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >>>  	 * data size to allow simple reading of samples that wrap around the
> >>>  	 * end of a ring buffer. See kernel implementation for details.
> >>>  	 * */
> >>> -	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> >>> -		   MAP_SHARED, map_fd, rb->page_size);
> >>> +	ro_size = rb->page_size + 2 * (__u64)info.max_entries;
> >>
> >> [..]
> >>
> >>> +	if (ro_size != (__u64)(size_t)ro_size) {
> >>> +		pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> >>> +			info.max_entries);
> >>> +		return libbpf_err(-E2BIG);
> >>> +	}
> >>
> >> Why do we need this check at all?
> >> IIUC, the problem is that the expression
> >> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> >> overflow. So why isn't doing only this part enough?
> >>
> >>   size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> >>   mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
> >>
> >> sizeof(size_t) should be 8, so no overflow is possible?
> >
> > not on 32-bit arches, presumably?
>
> Yes. For a 32-bit kernel, the total virtual address space for user
> space and kernel space is 4GB, so when max_entries is 2GB the needed
> virtual address space will be 2GB + 4GB, and the mapping of the ring
> buffer will fail either in the kernel or in userspace. An extreme case
> is a 32-bit userspace under a 64-bit kernel: the mapping of a 2GB ring
> buffer in the kernel is OK, but 4GB will overflow size_t in 32-bit
> userspace.
>
> >>> +	tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> >>> +		   rb->page_size);
> >
> > should we split this mmap into two mmaps -- one for the producer_pos
> > page, another for the data area? That will presumably allow to mmap a
> > ringbuf with max_entries = 1GB?
>
> I don't understand the reason for the splitting. Even without the
> splitting, in theory a ring buffer with max_entries = 1GB will be OK
> for a 32-bit kernel, although in practice the mapping of a 1GB ring
> buffer on a 32-bit kernel will fail because the most common size of the
> kernel virtual address space is 1GB (although ARM could use VMSPLIT_1G
> to increase the kernel virtual address space to 3GB).

Yep, never mind. size_t is unsigned, so it can express up to 4GB, so
2GB + 4KB is fine as is already (even though it most probably will fail).

> >>>  	if (tmp == MAP_FAILED) {
> >>>  		err = -errno;
> >>>  		ringbuf_unmap_ring(rb, r);
> >>> --
> >>> 2.29.2
>