On 09.02.24 05:05, Alexei Starovoitov wrote:
From: Alexei Starovoitov <ast@xxxxxxxxxx>

v1->v2:
- Improved commit log with reasons for using vmap_pages_range() in bpf_arena. Thanks to Johannes
- Added support for __arena global variables in bpf programs
- Fixed race conditions spotted by Barret
- Fixed wrap32 issue spotted by Barret
- Fixed bpf_map_mmap_sz() the way Andrii suggested

The work on bpf_arena was inspired by Barret's work:
https://github.com/google/ghost-userspace/blob/main/lib/queue.bpf.h
that implements queues, lists and AVL trees completely as bpf programs using giant bpf array map and integer indices instead of pointers. bpf_arena is a sparse array that allows to use normal C pointers to build such data structures. Last few patches implement page_frag allocator, link list and hash table as bpf programs.

v1:
bpf programs have multiple options to communicate with user space:
- Various ring buffers (perf, ftrace, bpf): The data is streamed unidirectionally from bpf to user space.
- Hash map: The bpf program populates elements, and user space consumes them via bpf syscall.
- mmap()-ed array map: Libbpf creates an array map that is directly accessed by the bpf program and mmap-ed to user space. It's the fastest way. Its disadvantage is that memory for the whole array is reserved at the start.

Introduce bpf_arena, which is a sparse shared memory region between the bpf program and user space.

Use cases:
1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed anonymous region, like memcached or any key/value storage. The bpf program implements an in-kernel accelerator. XDP prog can search for a key in bpf_arena and return a value without going to user space.
Just so I understand it correctly: this is all backed by unmovable and unswappable memory.
Is there any (existing?) way to restrict/cap the memory consumption via this interface? How easy is this to access+use by unprivileged userspace?
arena_vm_fault() seems to allocate new pages simply via alloc_page(GFP_KERNEL | __GFP_ZERO), with no memory accounting, mlock limit checks, etc.
We certainly don't want each and every application to be able to break page compaction, swapping, etc.; that's why I am asking.
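
To make the question concrete, here is a minimal user-space sketch of the kind of limit check being asked about: before pinning more memory, compare the running total against a per-user budget such as RLIMIT_MEMLOCK. This is not code from the patch set and not kernel code. The helper names pinned_charge_ok() and memlock_limit() are hypothetical, and whether bpf_arena should use a check like this or some other form of accounting is exactly what is still open.

```c
#include <stdbool.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (illustration only, not from the patch set):
 * return true if charging `request` more bytes keeps the total of
 * pinned memory within `limit`. The subtraction form avoids overflow
 * that `already_pinned + request` could cause.
 */
bool pinned_charge_ok(rlim_t already_pinned, rlim_t request, rlim_t limit)
{
    if (limit == RLIM_INFINITY)
        return true;            /* no cap configured */
    if (already_pinned > limit)
        return false;           /* already over budget */
    return request <= limit - already_pinned;
}

/*
 * Hypothetical helper: read the soft RLIMIT_MEMLOCK of the calling
 * process. Returns 0 on success and -1 if getrlimit() fails.
 */
int memlock_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}
```

A caller would fetch the limit once with memlock_limit() and call pinned_charge_ok() before each new page is populated, refusing the allocation when it returns false.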
--
Cheers,

David / dhildenb