On 2/7/24 07:34, Donald Hunter wrote:
Use cases:

1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed anonymous region, like memcached or any key/value storage. The bpf program implements an in-kernel accelerator. XDP prog can search for a key in bpf_arena and return a value without going to user space.
2. The bpf program builds arbitrary data structures in bpf_arena (hash tables, rb-trees, sparse arrays), while user space occasionally consumes it.
3. bpf_arena is a "heap" of memory from the bpf program's point of view. It is not shared with user space.

Initially, the kernel vm_area and user vma are not populated. User space can fault in pages within the range. While servicing a page fault, bpf_arena logic will insert a new page into the kernel and user vmas. The bpf program can allocate pages from that region via bpf_arena_alloc_pages(). This kernel function will insert pages into the kernel vm_area. The subsequent fault-in from user space will populate that page into the user vma.

The BPF_F_SEGV_ON_FAULT flag at arena creation time can be used to prevent fault-in from user space. In such a case, if a page is not allocated by the bpf program and not present in the kernel vm_area, the user process will segfault. This is useful for use cases 2 and 3 above.

bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages either at a specific address within the arena or allocates a range with the maple tree. bpf_arena_free_pages() is analogous to munmap(), which frees pages and removes the range from the kernel vm_area and from user process vmas.

bpf_arena can be used as a bpf program "heap" of up to 4GB. The memory is not shared with user space. This is use case 3. In such a case, the BPF_F_NO_USER_CONV flag is recommended. It will tell the verifier to treat the
>
I can see _what_ this flag does, but it's not clear what the consequences of this flag are. Perhaps it would be better named BPF_F_NO_USER_ACCESS?
i can see a use for NO_USER_CONV, but also still allowing user access. userspace could mmap the region, but only look at scalars within it. this is similar to what i do today with array maps in my BPF schedulers. that's a little different than Case 3.
if i knew userspace wasn't going to follow pointers, NO_USER_CONV would both be a speedup and make it so i don't have to worry about mmapping to the same virtual address in every process that shares the arena map. though this latter feature isn't in the code. right now you have to have it mmapped at the same user_va in all address spaces. that's not a huge deal for me either way.
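to make the same-address point concrete, here is a rough userspace sketch. it is plain POSIX shared memory, not arena code: the helper names (map_twice, writer_fill, reader_follow_offset, reader_ptr_is_local) and struct node are made up for illustration, and mapping one file twice in a single process only stands in for two processes that mapped the same region at different addresses. scalars and base-relative offsets survive the different base; a raw pointer stored by the writer does not.

```c
/* Illustration only: plain POSIX shared memory, NOT the bpf_arena API.
 * All helper names and struct node are hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096

struct node {
	uint64_t value;        /* scalar: meaningful through any mapping */
	uint64_t next_off;     /* offset from region base: position independent */
	struct node *next_ptr; /* raw pointer: only valid at the writer's base */
};

/* Map the same backing file twice. The two mappings coexist, so they get
 * distinct addresses, like two processes with different user_va. */
static int map_twice(void **a, void **b)
{
	FILE *f = tmpfile();
	if (!f)
		return -1;
	if (ftruncate(fileno(f), REGION_SIZE) != 0) {
		fclose(f);
		return -1;
	}
	*a = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
	*b = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
	fclose(f); /* the mappings stay valid after the descriptor is closed */
	return (*a == MAP_FAILED || *b == MAP_FAILED) ? -1 : 0;
}

/* Writer builds a two-node list, linking it both by offset and by pointer. */
static void writer_fill(void *base)
{
	assert(2 * sizeof(struct node) <= REGION_SIZE);
	struct node *n0 = base;
	struct node *n1 = n0 + 1;

	n0->value = 1;
	n1->value = 2;
	n0->next_off = (uint64_t)((char *)n1 - (char *)base);
	n0->next_ptr = n1;
	n1->next_off = 0;
	n1->next_ptr = NULL;
}

/* Reader follows the offset link relative to its own base address. */
static uint64_t reader_follow_offset(void *base)
{
	struct node *n0 = base;
	struct node *n1 = (struct node *)((char *)base + n0->next_off);
	return n1->value;
}

/* Returns 1 if the stored raw pointer lands inside this reader's mapping. */
static int reader_ptr_is_local(void *base)
{
	struct node *n0 = base;
	uintptr_t p = (uintptr_t)n0->next_ptr;
	uintptr_t lo = (uintptr_t)base;
	return p >= lo && p < lo + REGION_SIZE;
}
```

call map_twice(), run writer_fill() on the first mapping, then read through the second: the offset link resolves to the second node, while the stored pointer only falls inside the writer's own mapping.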
barret