On Mon, Mar 11, 2024 at 3:45 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > v2->v3:
> > - contains bpf bits only, but cc-ing past audience for continuity
> > - since prerequisite patches landed, this series focus on the main
> >   functionality of bpf_arena.
> > - adopted Andrii's approach to support arena in libbpf.
> > - simplified LLVM support. Instead of two instructions it's now only one.
> > - switched to cond_break (instead of open coded iters) in selftests
> > - implemented several follow-ups that will be sent after this set
> >   . remember first IP and bpf insn that faulted in arena.
> >     report to user space via bpftool
> >   . copy paste and tweak glob_match() aka mini-regex as a selftests/bpf
> > - see patch 1 for detailed description of bpf_arena
> >
> > v1->v2:
> > - Improved commit log with reasons for using vmap_pages_range() in arena.
> >   Thanks to Johannes
> > - Added support for __arena global variables in bpf programs
> > - Fixed race conditions spotted by Barret
> > - Fixed wrap32 issue spotted by Barret
> > - Fixed bpf_map_mmap_sz() the way Andrii suggested
> >
> > The work on bpf_arena was inspired by Barret's work:
> > https://github.com/google/ghost-userspace/blob/main/lib/queue.bpf.h
> > that implements queues, lists and AVL trees completely as bpf programs
> > using giant bpf array map and integer indices instead of pointers.
> > bpf_arena is a sparse array that allows to use normal C pointers to
> > build such data structures. Last few patches implement page_frag
> > allocator, link list and hash table as bpf programs.
> >
> > v1:
> > bpf programs have multiple options to communicate with user space:
> > - Various ring buffers (perf, ftrace, bpf): The data is streamed
> >   unidirectionally from bpf to user space.
> > - Hash map: The bpf program populates elements, and user space consumes
> >   them via bpf syscall.
> > - mmap()-ed array map: Libbpf creates an array map that is directly
> >   accessed by the bpf program and mmap-ed to user space. It's the fastest
> >   way. Its disadvantage is that memory for the whole array is reserved at
> >   the start.
> >
> > Alexei Starovoitov (13):
> >   bpf: Introduce bpf_arena.
> >   bpf: Disasm support for addr_space_cast instruction.
> >   bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
> >   bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
> >   bpf: Recognize addr_space_cast instruction in the verifier.
> >   bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
> >   libbpf: Add __arg_arena to bpf_helpers.h
> >   libbpf: Add support for bpf_arena.
> >   bpftool: Recognize arena map type
> >   bpf: Add helper macro bpf_addr_space_cast()
> >   selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
> >   selftests/bpf: Add bpf_arena_list test.
> >   selftests/bpf: Add bpf_arena_htab test.
> >
> > Andrii Nakryiko (1):
> >   libbpf: Recognize __arena global varaibles.
> >
> >  arch/x86/net/bpf_jit_comp.c                   | 231 +++++++-
> >  include/linux/bpf.h                           |  10 +-
> >  include/linux/bpf_types.h                     |   1 +
> >  include/linux/bpf_verifier.h                  |   1 +
> >  include/linux/filter.h                        |   4 +
> >  include/uapi/linux/bpf.h                      |  14 +
> >  kernel/bpf/Makefile                           |   3 +
> >  kernel/bpf/arena.c                            | 558 ++++++++++++++++++
> >  kernel/bpf/btf.c                              |  19 +-
> >  kernel/bpf/core.c                             |  16 +
> >  kernel/bpf/disasm.c                           |  10 +
> >  kernel/bpf/log.c                              |   3 +
> >  kernel/bpf/syscall.c                          |  42 ++
> >  kernel/bpf/verifier.c                         | 123 +++-
> >  .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
> >  tools/bpf/bpftool/gen.c                       |  13 +
> >  tools/bpf/bpftool/map.c                       |   2 +-
> >  tools/include/uapi/linux/bpf.h                |  14 +
> >  tools/lib/bpf/bpf_helpers.h                   |   1 +
> >  tools/lib/bpf/libbpf.c                        | 163 ++++-
> >  tools/lib/bpf/libbpf.h                        |   2 +-
> >  tools/lib/bpf/libbpf_probes.c                 |   7 +
> >  tools/testing/selftests/bpf/DENYLIST.aarch64  |   2 +
> >  tools/testing/selftests/bpf/DENYLIST.s390x    |   2 +
> >  tools/testing/selftests/bpf/bpf_arena_alloc.h |  67 +++
> >  .../testing/selftests/bpf/bpf_arena_common.h  |  70 +++
> >  tools/testing/selftests/bpf/bpf_arena_htab.h  | 100 ++++
> >  tools/testing/selftests/bpf/bpf_arena_list.h  |  92 +++
> >  .../testing/selftests/bpf/bpf_experimental.h  |  43 ++
> >  .../selftests/bpf/prog_tests/arena_htab.c     |  88 +++
> >  .../selftests/bpf/prog_tests/arena_list.c     |  68 +++
> >  .../selftests/bpf/prog_tests/verifier.c       |   2 +
> >  .../testing/selftests/bpf/progs/arena_htab.c  |  48 ++
> >  .../selftests/bpf/progs/arena_htab_asm.c      |   5 +
> >  .../testing/selftests/bpf/progs/arena_list.c  |  87 +++
> >  .../selftests/bpf/progs/verifier_arena.c      | 146 +++++
> >  tools/testing/selftests/bpf/test_loader.c     |   9 +-
> >  37 files changed, 2028 insertions(+), 40 deletions(-)
> >  create mode 100644 kernel/bpf/arena.c
> >  create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h
> >  create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h
> >  create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h
> >  create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/verifier_arena.c
> >
> > --
> > 2.43.0
> >
>
> Besides a few comments on patch #1 (and maybe one or two potential
> corner case issues I mentioned, which can be easily fixed), the series
> looked good. So I've applied patches as is. I fixed typo ("varaibles")
> in one of the commit subjects while applying.

Thanks!
That subj typo survived two months of reviews in v1,v2,v3
and I swear I use ./scripts/checkpatch.pl --codespell all the time.
I guess it got drowned in all of the messages like:

WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+       *mmaped = map->mmaped;
         ^^^^^^

WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+       *mmaped = map->mmaped;
                       ^^^^^^

> Also, in one of the selftests you hard-coded PAGE_SIZE to 4096, which
> isn't correct on some architectures, so please see how you can make it
> not hard-coded (but still work for both bpf and user code). It seemed
> minor enough to not delay patches (either way those architectures
> don't support ARENA just yet).

yes. It's on todo list already. I've added
#define PAGE_SIZE 4096
to user space side of bpf selftest, because it's used
in bpf_arena_*.h code which is dual compiled as bpf prog
(and then it's using PAGE_SIZE from vmlinux.h) and compiled
as native code.
So bpf side gets correct PAGE_SIZE automatically as a nice constant
at compile time, but for user space there is no good
PAGE_SIZE constant to use.
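Just doing
#define PAGE_SIZE sysconf(_SC_PAGE_SIZE)
produces inefficient code, because every use of PAGE_SIZE then becomes
a sysconf() call at run time instead of a compile-time constant.
Hence I left it as a todo to figure out later.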
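
For illustration only, one way to avoid the repeated sysconf() call on
the user space side is to query the page size once and cache it. This is
a minimal sketch and not what the selftests do: arena_page_size() and
round_up_to_pages() are hypothetical names, and the __bpf__ guard assumes
the bpf side is built with clang, which predefines that macro for the
bpf target.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of the series): ask the kernel for the
 * page size on first use and cache it, so later uses read a variable
 * instead of calling sysconf() again.
 */
static size_t cached_page_size;

static inline size_t arena_page_size(void)
{
	if (cached_page_size == 0) {
		long sz = sysconf(_SC_PAGE_SIZE);

		assert(sz > 0);
		cached_page_size = (size_t)sz;
	}
	return cached_page_size;
}

/* Assumption: only the native (user space) build gets the runtime value.
 * The bpf build keeps the compile-time PAGE_SIZE it already has.
 */
#ifndef __bpf__
#undef PAGE_SIZE
#define PAGE_SIZE arena_page_size()
#endif

/* Example consumer: round a byte count up to a whole number of pages. */
static inline size_t round_up_to_pages(size_t bytes)
{
	size_t ps = PAGE_SIZE;

	return (bytes + ps - 1) / ps * ps;
}
```

With this, each use costs a load and a branch rather than a library
call. PAGE_SIZE is still not a compile-time constant on the user space
side, so code that needs it in array sizes or static initializers would
not build with this approach.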