On Wed, Jan 11, 2023 at 12:27 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Tue, Jan 10, 2023 at 02:49:59PM +0100, andrea terzolo wrote:
> > Hello!
> >
> > If I may, I'd like to ask a question about the BPF_MAP_TYPE_RINGBUF
> > map. Looking at the kernel implementation [0], it seems that the data
> > pages are mapped twice to make the implementation simpler and more
> > efficient. This seems to be a ring buffer peculiarity; the perf
> > buffer has no such mechanism. In the Falco project [1] we use huge
> > per-CPU buffers to collect almost all the syscalls that the system
> > issues, and the default size of each buffer is 8 MB. This means that
> > using the ring buffer approach on a system with 128 CPUs we will
> > have (128*8*2) MB, while with the perf buffer only (128*8) MB. The
>
> hum, IIUC it's not allocated twice, the pages are just mapped twice
> to cope with wrap-around samples, as described in the git log:
>
>   One interesting implementation bit, that significantly simplifies (and thus
>   speeds up as well) implementation of both producers and consumers is how data
>   area is mapped twice contiguously back-to-back in the virtual memory. This
>   allows to not take any special measures for samples that have to wrap around
>   at the end of the circular buffer data area, because the next page after the
>   last data page would be first data page again, and thus the sample will still
>   appear completely contiguous in virtual memory. See comment and a simple ASCII
>   diagram showing this visually in bpf_ringbuf_area_alloc().

yes, exactly, there is no duplication of memory, it's just mapped
twice to make working with records that wrap around simple and
efficient (see the small userspace sketch at the bottom of this mail)

> > issue is that this memory requirement could be too much for some
> > systems, and also in Kubernetes environments where there are strict
> > resource limits... Our current workaround is to share each ring
> > buffer between more than one CPU through a BPF_MAP_TYPE_ARRAY_OF_MAPS,
> > so for example we allocate one ring buffer for each CPU pair.
> > Unfortunately, this solution has a price, since we increase the
> > contention on the ring buffers, and as highlighted here [2], the
> > presence of multiple competing writers on the same buffer could
> > become a real bottleneck... Sorry for the long introduction; my
> > question is: are there any other approaches to manage such a
> > scenario? Will it be possible to use the ring buffer without the
> > kernel double mapping in the near future? The ring buffer has such
> > amazing features with respect to the perf buffer, but in a scenario
> > like the Falco one, where we have aggressive multiple producers,
> > this double mapping could become a limitation.
>
> AFAIK the BPF ring buffer can be used across CPUs, so you don't need
> to have an extra copy for each CPU if you don't really want to

seems like they do share, but only between pairs of CPUs. But nothing
prevents you from sharing between more than 2 CPUs, right? It's a
tradeoff between contention and overall memory usage (but as pointed
out, ringbuf doesn't use 2x more memory). Do you actually see a lot
of contention when sharing a ringbuf between 2 CPUs? There are
multiple applications that share a single ringbuf between all CPUs,
and no one has really complained about high contention so far. You'd
probably need to push tons of data non-stop, at which point I'd worry
about consumers not being able to keep up (and definitely not doing
much useful with all this data). But YMMV, of course.
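
FWIW, for anyone reading along, a minimal BPF-side sketch of that
per-CPU-pair sharing through BPF_MAP_TYPE_ARRAY_OF_MAPS could look
like the below. This is not Falco's actual code; the map names, event
layout, sizes, and the tracepoint are made up for illustration:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* example event layout, made up for this sketch */
struct event {
	__u32 cpu;
	__u64 ts;
};

/* inner map template: one 8 MB ringbuf */
struct ringbuf_map {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 8 * 1024 * 1024);
} rb0 SEC(".maps");

/* outer array: one ringbuf slot per CPU pair (64 slots for 128 CPUs) */
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 64);
	__uint(key_size, sizeof(__u32));
	__array(values, struct ringbuf_map);
} ringbufs SEC(".maps") = {
	.values = { &rb0 },	/* userspace fills the remaining slots */
};

SEC("tp/raw_syscalls/sys_enter")
int handle_sys_enter(void *ctx)
{
	__u32 cpu = bpf_get_smp_processor_id();
	__u32 slot = cpu / 2;	/* two CPUs share one buffer */
	void *rb = bpf_map_lookup_elem(&ringbufs, &slot);
	struct event *e;

	if (!rb)
		return 0;

	e = bpf_ringbuf_reserve(rb, sizeof(*e), 0);
	if (!e)
		return 0;	/* buffer full: the record is dropped */

	e->cpu = cpu;
	e->ts = bpf_ktime_get_ns();
	bpf_ringbuf_submit(e, 0);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

On the userspace side you'd create the remaining inner ringbufs (e.g.
with bpf_map_create()) and plug them into the outer array slots with
bpf_map_update_elem() before attaching the program. Sharing between
more than 2 CPUs is then just a matter of changing the divisor and
the slot count.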
> jirka
>
> > Thank you in advance for your time,
> > Andrea
> >
> > 0: https://github.com/torvalds/linux/blob/master/kernel/bpf/ringbuf.c#L107
> > 1: https://github.com/falcosecurity/falco
> > 2: https://patchwork.ozlabs.org/project/netdev/patch/20200529075424.3139988-5-andriin@xxxxxx/
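
P.S. since the double mapping came up: here's a small userspace
analogue of what bpf_ringbuf_area_alloc() does, sketched with
memfd_create() and two back-to-back mmap()s. Illustration only, not
the kernel code, and most error handling is trimmed:

#define _GNU_SOURCE		/* for memfd_create() */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t size = 2 * sysconf(_SC_PAGESIZE);	/* tiny 2-page "data area" */
	int fd = memfd_create("ringbuf-demo", 0);
	char *area;

	if (fd < 0 || ftruncate(fd, size) < 0)
		return 1;

	/* reserve a virtual window twice the buffer size... */
	area = mmap(NULL, 2 * size, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return 1;

	/* ...then map the same pages into both halves, back to back */
	mmap(area, size, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_FIXED, fd, 0);
	mmap(area + size, size, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_FIXED, fd, 0);

	/* a record written past the end of the buffer stays contiguous */
	strcpy(area + size - 3, "wrap");
	printf("%s\n", area + size - 3);	/* prints "wrap" */
	printf("%c\n", area[0]);		/* prints "p": same pages */
	return 0;
}

Both mappings point at the same physical pages, so the data area
costs its size only once; only the virtual address range is doubled,
and a record written past the end shows up contiguously at the start.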