On Wed, 2025-01-29 at 10:45 -0700, Daniel Xu wrote: > On Wed, Jan 29, 2025 at 09:49:12AM -0700, Daniel Xu wrote: > > Hi Ilya, > > > > On Wed, Jan 29, 2025 at 03:58:54PM +0100, Ilya Leoshkevich wrote: > > > On Tue, 2025-01-14 at 13:28 -0700, Daniel Xu wrote: > > > > This commit allows progs to elide a null check on statically > > > > known > > > > map > > > > lookup keys. In other words, if the verifier can statically > > > > prove > > > > that > > > > the lookup will be in-bounds, allow the prog to drop the null > > > > check. > > > > > > > > This is useful for two reasons: > > > > > > > > 1. Large numbers of nullness checks (especially when they > > > > cannot > > > > fail) > > > > unnecessarily pushes prog towards > > > > BPF_COMPLEXITY_LIMIT_JMP_SEQ. > > > > 2. It forms a tighter contract between programmer and verifier. > > > > > > > > For (1), bpftrace is starting to make heavier use of percpu > > > > scratch > > > > maps. As a result, for user scripts with large number of > > > > unrolled > > > > loops, > > > > we are starting to hit jump complexity verification errors. > > > > These > > > > percpu lookups cannot fail anyways, as we only use static key > > > > values. > > > > Eliding nullness probably results in less work for verifier as > > > > well. > > > > > > > > For (2), percpu scratch maps are often used as a larger stack, > > > > as the > > > > currrent stack is limited to 512 bytes. In these situations, it > > > > is > > > > desirable for the programmer to express: "this lookup should > > > > never > > > > fail, > > > > and if it does, it means I messed up the code". By omitting the > > > > null > > > > check, the programmer can "ask" the verifier to double check > > > > the > > > > logic. > > > > > > > > Tests also have to be updated in sync with these changes, as > > > > the > > > > verifier is more efficient with this change. Notable, iters.c > > > > tests > > > > had > > > > to be changed to use a map type that still requires null > > > > checks, as > > > > it's > > > > exercising verifier tracking logic w.r.t iterators. > > > > > > > > Signed-off-by: Daniel Xu <dxu@xxxxxxxxx> > > > > --- > > > > kernel/bpf/verifier.c | 92 > > > > ++++++++++++++++++- > > > > tools/testing/selftests/bpf/progs/iters.c | 14 +-- > > > > .../selftests/bpf/progs/map_kptr_fail.c | 2 +- > > > > .../selftests/bpf/progs/verifier_map_in_map.c | 2 +- > > > > .../testing/selftests/bpf/verifier/map_kptr.c | 2 +- > > > > 5 files changed, 99 insertions(+), 13 deletions(-) > > > > > > [...] > > > > > > > @@ -9158,6 +9216,7 @@ static int check_func_arg(struct > > > > bpf_verifier_env *env, u32 arg, > > > > enum bpf_arg_type arg_type = fn->arg_type[arg]; > > > > enum bpf_reg_type type = reg->type; > > > > u32 *arg_btf_id = NULL; > > > > + u32 key_size; > > > > int err = 0; > > > > > > > > if (arg_type == ARG_DONTCARE) > > > > @@ -9291,8 +9350,13 @@ static int check_func_arg(struct > > > > bpf_verifier_env *env, u32 arg, > > > > verbose(env, "invalid map_ptr to > > > > access map- > > > > > key\n"); > > > > return -EACCES; > > > > } > > > > - err = check_helper_mem_access(env, regno, > > > > meta- > > > > > map_ptr->key_size, > > > > - BPF_READ, false, > > > > NULL); > > > > + key_size = meta->map_ptr->key_size; > > > > + err = check_helper_mem_access(env, regno, > > > > key_size, > > > > BPF_READ, false, NULL); > > > > + if (err) > > > > + return err; > > > > + meta->const_map_key = > > > > get_constant_map_key(env, reg, > > > > key_size); > > > > + if (meta->const_map_key < 0 && meta- > > > > >const_map_key > > > > != -EOPNOTSUPP) > > > > + return meta->const_map_key; > > > > > > Mark Hartmayer reported a problem that after this commit the > > > verifier > > > started refusing to load libvirt's virCgroupV2DevicesLoadProg(), > > > which > > > contains the following snippet: > > > > > > 53: (b7) r1 = -1 ; R1_w=-1 > > > 54: (7b) *(u64 *)(r10 -8) = r1 ; R1_w=-1 R10=fp0 fp-8_w=-1 > > > 55: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 > > > 56: (07) r2 += -8 ; R2_w=fp-8 > > > 57: (18) r1 = 0x9553c800 ; R1_w=map_ptr(ks=8,vs=4) > > > 59: (85) call bpf_map_lookup_elem#1 > > > > > > IIUC here the actual constant value is -1, which this code > > > confuses > > > with an error. > > > > Thanks for reporting. I think I know what the issue is - will send > > a > > patch shortly. > > > > Daniel > > > > I cribbed the source from [0] and tested before and after. I think > this > should work. Mind giving it a try? > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 9971c03adfd5..e9176a5ce215 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -9206,6 +9206,8 @@ static s64 get_constant_map_key(struct > bpf_verifier_env *env, > return reg->var_off.value; > } > > +static bool can_elide_value_nullness(enum bpf_map_type type); > + > static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > struct bpf_call_arg_meta *meta, > const struct bpf_func_proto *fn, > @@ -9354,9 +9356,11 @@ static int check_func_arg(struct > bpf_verifier_env *env, u32 arg, > err = check_helper_mem_access(env, regno, key_size, > BPF_READ, false, NULL); > if (err) > return err; > - meta->const_map_key = get_constant_map_key(env, reg, > key_size); > - if (meta->const_map_key < 0 && meta->const_map_key != > -EOPNOTSUPP) > - return meta->const_map_key; > + if (can_elide_value_nullness(meta->map_ptr- > >map_type)) { > + meta->const_map_key = > get_constant_map_key(env, reg, key_size); > + if (meta->const_map_key < 0 && meta- > >const_map_key != -EOPNOTSUPP) > + return meta->const_map_key; > + } > break; > case ARG_PTR_TO_MAP_VALUE: > if (type_may_be_null(arg_type) && > register_is_null(reg)) > > Thanks, > Daniel > > > [0]: > https://github.com/libvirt/libvirt/blob/c1166be3475a0269f5164d87fec6227d6cb34b47/src/util/vircgroupv2devices.c#L66-L135 Thanks, I tried this in isolation and it fixed the issue for me. I talked to Mark and he will try it with his libvirt scenario. The code looks reasonable to me, but I have a small concern regarding what will happen if the BPF code uses a -EOPNOTSUPP immediate with an array. Unlike other immediates, IIUC this will cause check_func_arg() to return 0. Is there a reason to have this special?