On Wed, Aug 10, 2022 at 10:18 AM Francis Laniel
<flaniel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> By default, BPF ring buffer are size bounded, when producers already filled the
> buffer, they need to wait for the consumer to get those data before adding new
> ones.
> In terms of API, bpf_ringbuf_reserve() returns NULL if the buffer is full.
>
> This patch permits making BPF ring buffer overwritable.
> When producers already wrote as many data as the buffer size, they will begin to
> over write existing data, so the oldest will be replaced.
> As a result, bpf_ringbuf_reserve() never returns NULL.
>

Part of each BPF ringbuf record (the first 8 bytes) stores information
such as the record size and the offset, in pages, to the beginning of
the ringbuf map metadata. The consumer uses this to know how much data
belongs to a record, and it is also what makes sure that
bpf_ringbuf_reserve()/bpf_ringbuf_submit() work correctly and don't
corrupt kernel memory.

If we simply allow overwriting this information (and no, the spinlock
doesn't protect from that: you can have multiple producers writing to
different parts of the ringbuf data area in parallel after "reserving"
their respective records), it completely breaks any sort of
correctness, both for the user-space consumer and for kernel-side
producers.

> Signed-off-by: Francis Laniel <flaniel@xxxxxxxxxxxxxxxxxxx>
> ---
>  include/uapi/linux/bpf.h |  3 +++
>  kernel/bpf/ringbuf.c     | 51 +++++++++++++++++++++++++++++++---------
>  2 files changed, 43 insertions(+), 11 deletions(-)
>

[...]
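
To make the record-header point above concrete, here is a minimal
user-space sketch, not kernel code. It assumes the 8-byte header layout
(a 32-bit length with two flag bits, then a 32-bit page offset) and the
"header + payload, rounded up to 8" stride that the ringbuf uses. All
rb_* names and the 64-byte buffer are hypothetical and exist only for
illustration. It shows a consumer walking records by trusting each
header, and what happens once one header is overwritten with payload
bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values mirror BPF_RINGBUF_BUSY_BIT, BPF_RINGBUF_DISCARD_BIT and
 * BPF_RINGBUF_HDR_SZ from include/uapi/linux/bpf.h. */
#define RB_BUSY_BIT    (1U << 31)
#define RB_DISCARD_BIT (1U << 30)
#define RB_HDR_SZ      8

/* Assumed layout of the 8-byte record header. */
struct rb_hdr {
	uint32_t len;    /* payload length plus BUSY/DISCARD flag bits */
	uint32_t pg_off; /* offset in pages back to the ringbuf metadata */
};

/* Bytes a record occupies: header + payload, rounded up to 8. */
static uint32_t rb_record_span(uint32_t len_field)
{
	uint32_t len = len_field & ~(RB_BUSY_BIT | RB_DISCARD_BIT);

	return (len + RB_HDR_SZ + 7) & ~7U;
}

/* Hypothetical helper: store an already-committed record at pos. */
static void rb_write(uint8_t *data, uint32_t pos,
		     const void *payload, uint32_t len)
{
	struct rb_hdr hdr = { .len = len, .pg_off = 0 };

	assert(pos % 8 == 0); /* records always start 8-byte aligned */
	memcpy(data + pos, &hdr, sizeof(hdr));
	memcpy(data + pos + RB_HDR_SZ, payload, len);
}

/* Consumer walk from cons to prod, trusting every header it meets.
 * Returns the number of records, or -1 if the walk does not land
 * exactly on prod, i.e. some header was not a real header. */
static int rb_count_records(const uint8_t *data, uint32_t cons,
			    uint32_t prod)
{
	int n = 0;

	while (cons < prod) {
		struct rb_hdr hdr;

		memcpy(&hdr, data + cons, sizeof(hdr));
		if (hdr.len & RB_BUSY_BIT)
			break; /* record reserved but not yet submitted */
		cons += rb_record_span(hdr.len);
		n++;
	}
	return cons == prod ? n : -1;
}

/* Write three records; optionally clobber the second header the way
 * an overwriting producer's payload would. */
static int rb_demo(int corrupt)
{
	uint8_t data[64] = { 0 };

	rb_write(data, 0, "AAAAAAAA", 8);      /* spans bytes  0..15 */
	rb_write(data, 16, "BBBBBBBBBBBB", 12); /* spans bytes 16..39 */
	rb_write(data, 40, "CCCC", 4);          /* spans bytes 40..55 */
	if (corrupt)
		memset(data + 16, 0x41, RB_HDR_SZ); /* payload over header */
	return rb_count_records(data, 0, 56);
}
```

With intact headers, rb_demo(0) finds all three records. With the
second header clobbered, rb_demo(1) reads a garbage length and the walk
runs past the producer position, so the consumer can no longer tell
where any later record starts. The kernel-side problem described above
(pg_off being used to get back to the ringbuf metadata) is not modeled
here.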