Re: [QUESTION] usage of BPF_MAP_TYPE_RINGBUF

Jiri Olsa <olsajiri@xxxxxxxxx> · Wed, 11 Jan 2023 09:27:50 +0100



On Tue, Jan 10, 2023 at 02:49:59PM +0100, andrea terzolo wrote:
> Hello!
> 
> If I may, I would like to ask a question regarding the BPF_MAP_TYPE_RINGBUF
> map. Looking at the kernel implementation [0] it seems that data pages
> are mapped 2 times to have a more efficient and simpler
> implementation. This seems to be a ring buffer peculiarity; the perf
> buffer doesn't have such an implementation. In the Falco project [1] we
> use huge per-CPU buffers to collect almost all the syscalls that the
> system throws and the default size of each buffer is 8 MB. This means
> that using the ring buffer approach on a system with 128 CPUs, we will
> have (128*8*2) MB, while with the perf buffer only (128*8) MB. The

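hm, IIUC the data area is not allocated twice; the same pages are just
mapped twice to cope with samples that wrap around, so the physical
memory stays at the buffer size and only the virtual address range
doubles. It's described in the git log: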

    One interesting implementation bit, that significantly simplifies (and thus
    speeds up as well) implementation of both producers and consumers is how data
    area is mapped twice contiguously back-to-back in the virtual memory. This
    allows to not take any special measures for samples that have to wrap around
    at the end of the circular buffer data area, because the next page after the
    last data page would be first data page again, and thus the sample will still
    appear completely contiguous in virtual memory. See comment and a simple ASCII
    diagram showing this visually in bpf_ringbuf_area_alloc().

> issue is that this memory requirement could be too much for some
> systems and also in Kubernetes environments where there are strict
> resource limits... Our actual workaround is to use ring buffers shared
> between more than one CPU with a BPF_MAP_TYPE_ARRAY_OF_MAPS, so for
> example we allocate a ring buffer for each CPU pair. Unfortunately,
> this solution has a price since we increase the contention on the ring
> buffers and as highlighted here [2], the presence of multiple
> competing writers on the same buffer could become a real bottleneck...
> Sorry for the long introduction, my question here is, are there any
> other approaches to manage such a scenario? Will there be a
> possibility to use the ring buffer without the kernel double mapping
> in the near future? The ring buffer has such amazing features with
> respect to the perf buffer, but in a scenario like the Falco one,
> where we have aggressive multiple producers, this double mapping could
> become a limitation.

AFAIK the BPF ring buffer can be shared across CPUs, so you don't need
a separate buffer for each CPU unless you really want one

jirka

> 
> Thank you in advance for your time,
> Andrea
> 
> 0: https://github.com/torvalds/linux/blob/master/kernel/bpf/ringbuf.c#L107
> 1: https://github.com/falcosecurity/falco
> 2: https://patchwork.ozlabs.org/project/netdev/patch/20200529075424.3139988-5-andriin@xxxxxx/