On Thu, 2023-01-05 at 11:06 +0100, Jose E. Marchesi wrote: > > On Sat, Dec 31, 2022 at 8:31 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > BPF has two documented (non-atomic) memory store instructions: > > > > > > BPF_STX: *(size *) (dst_reg + off) = src_reg > > > BPF_ST : *(size *) (dst_reg + off) = imm32 > > > > > > Currently LLVM BPF back-end does not emit BPF_ST instruction and does > > > not allow one to be specified as inline assembly. > > > > > > Recently I've been exploring ways to port some of the verifier test > > > cases from tools/testing/selftests/bpf/verifier/*.c to use inline assembly > > > and machinery provided in tools/testing/selftests/bpf/test_loader.c > > > (which should hopefully simplify tests maintenance). > > > The BPF_ST instruction is popular in these tests: used in 52 of 94 files. > > > > > > While it is possible to adjust LLVM to only support BPF_ST for inline > > > assembly blocks it seems a bit wasteful. This patch-set contains a set > > > of changes to verifier necessary in case when LLVM is allowed to > > > freely emit BPF_ST instructions (source code is available here [1]). > > > > Would we gate LLVM's emitting of BPF_ST for C code behind some new > > cpu=v4? What is the benefit for compiler to start automatically emit > > such instructions? Such thinking about logistics, if there isn't much > > benefit, as BPF application owner I wouldn't bother enabling this > > behavior risking regressions on old kernels that don't have these > > changes. > > Hmm, GCC happily generates BPF_ST instructions: > > $ echo 'int v; void foo () { v = 666; }' | bpf-unknown-none-gcc -O2 -xc -S -o foo.s - > $ cat foo.s > .file "<stdin>" > .text > .align 3 > .global foo > .type foo, @function > foo: > lddw %r0,v > stw [%r0+0],666 > exit > .size foo, .-foo > .global v > .type v, @object > .lcomm v,4,4 > .ident "GCC: (GNU) 12.0.0 20211206 (experimental)" > > Been doing that since October 2019, I think before the cpu versioning > mechanism was got in place? > > We weren't aware this was problematic. Does the verifier reject such > instructions? Interesting, do BPF selftests generated by GCC pass the same way they do if generated by clang? I had to do the following changes to the verifier to make the selftests pass when BPF_ST instruction is allowed for selection: - patch #1 in this patchset: track values of constants written to stack using BPF_ST. Currently these are tracked imprecisely, unlike the writes using BPF_STX, e.g.: fp[-8] = 42; currently verifier assumes that fp[-8]=mmmmmmmm after such instruction, where m stands for "misc", just a note that something is written at fp[-8]. r1 = 42; verifier tracks r1=42 after this instruction. fp[-8] = r1; verifier tracks fp[-8]=42 after this instruction. So the patch makes both cases equivalent. - patch #3 in this patchset: adjusts verifier.c:convert_ctx_access() to operate on BPF_ST alongside BPF_STX. Context parameters for some BPF programs types are "fake" data structures. The verifier matches all BPF_STX and BPF_LDX instructions that operate on pointers to such contexts and rewrites these instructions. It might change an offset or add another layer of indirection, etc. E.g. see filter.c:bpf_convert_ctx_access(). (This also implies that verifier forbids writes to non-constant offsets inside such structures). So the patch extends this logic to also handle BPF_ST. > > > So I feel like the biggest benefit is to be able to use this > > instruction in embedded assembly, to make writing and maintaining > > tests easier. > > > > > The changes include: > > > - update to verifier.c:check_stack_write_*() functions to track > > > constant values spilled to stack via BPF_ST instruction in a same > > > way stack spills of known registers by BPF_STX are tracked; > > > - updates to verifier.c:convert_ctx_access() and it's callbacks to > > > handle BPF_ST instruction in a way similar to BPF_STX; > > > - some test adjustments and a few new tests. > > > > > > With the above changes applied and LLVM from [1] all test_verifier, > > > test_maps, test_progs and test_progs-no_alu32 test cases are passing. > > > > > > When built using the LLVM version from [1] BPF programs generated for > > > selftests and Cilium programs (taken from [2]) see certain reduction > > > in size, e.g. below are total numbers of instructions for files with > > > over 5K instructions: > > > > > > File Insns Insns Insns Diff > > > w/o with diff pct > > > BPF_ST BPF_ST > > > cilium/bpf_host.o 44620 43774 -846 -1.90% > > > cilium/bpf_lxc.o 36842 36060 -782 -2.12% > > > cilium/bpf_overlay.o 23557 23186 -371 -1.57% > > > cilium/bpf_xdp.o 26397 25931 -466 -1.77% > > > selftests/core_kern.bpf.o 12359 12359 0 0.00% > > > selftests/linked_list_fail.bpf.o 5501 5302 -199 -3.62% > > > selftests/profiler1.bpf.o 17828 17709 -119 -0.67% > > > selftests/pyperf100.bpf.o 49793 49268 -525 -1.05% > > > selftests/pyperf180.bpf.o 88738 87813 -925 -1.04% > > > selftests/pyperf50.bpf.o 25388 25113 -275 -1.08% > > > selftests/pyperf600.bpf.o 78330 78300 -30 -0.04% > > > selftests/pyperf_global.bpf.o 5244 5188 -56 -1.07% > > > selftests/pyperf_subprogs.bpf.o 5262 5192 -70 -1.33% > > > selftests/strobemeta.bpf.o 17154 16065 -1089 -6.35% > > > selftests/test_verif_scale2.bpf.o 11337 11337 0 0.00% > > > > > > (Instructions are counted by counting the number of instruction lines > > > in disassembly). > > > > > > Is community interested in this work? > > > Are there any omissions in my changes to the verifier? > > > > > > Known issue: > > > > > > There are two programs (one Cilium, one selftest) that exhibit > > > anomalous increase in verifier processing time with this patch-set: > > > > > > File Program Insns (A) Insns (B) Insns (DIFF) > > > ------------------- ----------------------------- --------- --------- -------------- > > > bpf_host.o tail_ipv6_host_policy_ingress 1576 2403 +827 (+52.47%) > > > map_kptr.bpf.o test_map_kptr 400 475 +75 (+18.75%) > > > ------------------- ----------------------------- --------- --------- -------------- > > > > > > These are under investigation. > > > > > > Thanks, > > > Eduard > > > > > > [1] https://reviews.llvm.org/D140804 > > > [2] git@xxxxxxxxxx:anakryiko/cilium.git > > > > > > Eduard Zingerman (5): > > > bpf: more precise stack write reasoning for BPF_ST instruction > > > selftests/bpf: check if verifier tracks constants spilled by > > > BPF_ST_MEM > > > bpf: allow ctx writes using BPF_ST_MEM instruction > > > selftests/bpf: test if pointer type is tracked for BPF_ST_MEM > > > selftests/bpf: don't match exact insn index in expected error message > > > > > > kernel/bpf/cgroup.c | 49 +++++--- > > > kernel/bpf/verifier.c | 102 +++++++++------- > > > net/core/filter.c | 72 ++++++------ > > > .../selftests/bpf/prog_tests/log_fixup.c | 2 +- > > > .../selftests/bpf/prog_tests/spin_lock.c | 8 +- > > > .../bpf/verifier/bounds_mix_sign_unsign.c | 110 ++++++++++-------- > > > .../selftests/bpf/verifier/bpf_st_mem.c | 29 +++++ > > > tools/testing/selftests/bpf/verifier/ctx.c | 11 -- > > > tools/testing/selftests/bpf/verifier/unpriv.c | 23 ++++ > > > 9 files changed, 249 insertions(+), 157 deletions(-) > > > create mode 100644 tools/testing/selftests/bpf/verifier/bpf_st_mem.c > > > > > > -- > > > 2.39.0 > > >