On Fri, Dec 13, 2024 at 03:02:11PM GMT, Andrii Nakryiko wrote:
> On Thu, Dec 12, 2024 at 3:23 PM Daniel Xu <dxu@xxxxxxxxx> wrote:
> >
> > This commit allows progs to elide a null check on statically known
> > map lookup keys. In other words, if the verifier can statically
> > prove that the lookup will be in-bounds, allow the prog to drop the
> > null check.
> >
> > This is useful for two reasons:
> >
> > 1. Large numbers of nullness checks (especially when they cannot
> >    fail) unnecessarily push the prog towards
> >    BPF_COMPLEXITY_LIMIT_JMP_SEQ.
> > 2. It forms a tighter contract between programmer and verifier.
> >
> > For (1), bpftrace is starting to make heavier use of percpu scratch
> > maps. As a result, for user scripts with a large number of unrolled
> > loops, we are starting to hit jump complexity verification errors.
> > These percpu lookups cannot fail anyway, as we only use static key
> > values. Eliding nullness probably results in less work for the
> > verifier as well.
> >
> > For (2), percpu scratch maps are often used as a larger stack, as
> > the current stack is limited to 512 bytes. In these situations, it
> > is desirable for the programmer to express: "this lookup should
> > never fail, and if it does, it means I messed up the code". By
> > omitting the null check, the programmer can "ask" the verifier to
> > double-check the logic.
> >
> > Tests also have to be updated in sync with these changes, as the
> > verifier is more efficient with this change. Notably, iters.c tests
> > had to be changed to use a map type that still requires null checks,
> > as they exercise verifier tracking logic w.r.t. iterators.
> >
> > Signed-off-by: Daniel Xu <dxu@xxxxxxxxx>
> > ---
> >  kernel/bpf/verifier.c                         | 80 ++++++++++++++++++-
> >  tools/testing/selftests/bpf/progs/iters.c     | 14 ++--
> >  .../selftests/bpf/progs/map_kptr_fail.c       |  2 +-
> >  .../selftests/bpf/progs/verifier_map_in_map.c |  2 +-
> >  .../testing/selftests/bpf/verifier/map_kptr.c |  2 +-
> >  5 files changed, 87 insertions(+), 13 deletions(-)
> >
>
> Eduard has great points. I've added a few more comments below.
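
For extra context, the scratch map pattern that motivates this looks
roughly like the following. This is a minimal sketch, not bpftrace's
actual codegen; the struct, map, and prog names are made up:

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

struct scratch {
	char buf[4096];	/* larger than the 512 byte BPF stack */
};

struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, struct scratch);
} scratch_map SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_getpid")
int use_scratch(void *ctx)
{
	__u32 key = 0;	/* constant key, provably in-bounds for a
			 * 1-entry array */
	struct scratch *s;

	s = bpf_map_lookup_elem(&scratch_map, &key);
	if (!s)		/* with this series, elidable: lookup cannot fail */
		return 0;

	s->buf[0] = 1;
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

Every elided check like this saves a jump, which is what keeps large
unrolled scripts under BPF_COMPLEXITY_LIMIT_JMP_SEQ.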
>
> pw-bot: cr
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 58b36cc96bd5..4947ef884a18 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -287,6 +287,7 @@ struct bpf_call_arg_meta {
> >  	u32 ret_btf_id;
> >  	u32 subprogno;
> >  	struct btf_field *kptr_field;
> > +	s64 const_map_key;
> >  };
> >
> >  struct bpf_kfunc_call_arg_meta {
> > @@ -9163,6 +9164,53 @@ static int check_reg_const_str(struct bpf_verifier_env *env,
> >  	return 0;
> >  }
> >
> > +/* Returns constant key value if possible, else -1 */
> > +static s64 get_constant_map_key(struct bpf_verifier_env *env,
> > +				struct bpf_reg_state *key,
> > +				u32 key_size)
> > +{
> > +	struct bpf_func_state *state = func(env, key);
> > +	struct bpf_reg_state *reg;
> > +	int zero_size = 0;
> > +	int stack_off;
> > +	u8 *stype;
> > +	int slot;
> > +	int spi;
> > +	int i;
> > +
> > +	if (!env->bpf_capable)
> > +		return -1;
> > +	if (key->type != PTR_TO_STACK)
> > +		return -1;
> > +	if (!tnum_is_const(key->var_off))
> > +		return -1;
> > +
> > +	stack_off = key->off + key->var_off.value;
> > +	slot = -stack_off - 1;
> > +	spi = slot / BPF_REG_SIZE;
> > +
> > +	/* First handle precisely tracked STACK_ZERO, up to BPF_REG_SIZE */
> > +	stype = state->stack[spi].slot_type;
> > +	for (i = 0; i < BPF_REG_SIZE && stype[i] == STACK_ZERO; i++)
>
> it's Friday and I'm lazy, but please double-check that this works for
> both big-endian and little-endian :)

Any tips? Are the existing tests that run through s390x hosts in CI
sufficient, or should I add some tests written in C (and not BPF
assembly)? I can never think about endianness correctly...

> with Eduard's suggestion this also becomes interesting when you have
> 000mmm mix (as one example), because that gives you a small range, and
> all values might be valid keys for arrays

Can you define what "small range" means? What range is there with 0's?
Any pointers would be helpful.

> > +		zero_size++;
> > +	if (zero_size == key_size)
> > +		return 0;
> > +
> > +	if (!is_spilled_reg(&state->stack[spi]))
> > +		/* Not pointer to stack */

> !is_spilled_reg and "Not pointer to stack" seem to be not exactly the
> same things?

You're right - the comment is not helpful. I'll make the change to use
is_spilled_scalar_reg(), which is probably as clear as it gets.

[..]
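
Concretely, for the tail of get_constant_map_key() I'm thinking
something along these lines (untested sketch; is_spilled_scalar_reg()
and spilled_ptr already exist in verifier.c, and any partial-spill vs
key_size handling is left out):

	/* Second, handle the case where the key was spilled to the
	 * stack from a register; only a spilled known-constant scalar
	 * can produce a constant key.
	 */
	if (!is_spilled_scalar_reg(&state->stack[spi]))
		return -1;

	reg = &state->stack[spi].spilled_ptr;
	if (!tnum_is_const(reg->var_off))
		return -1;

	return reg->var_off.value;

That way the check and the comment agree: we bail unless the slot holds
a spilled scalar with a known constant value.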