On Thu, Jan 30, 2025, at 7:41 PM, Daniel Xu wrote: > Hi Ilya, > > On Thu, Jan 30, 2025, at 2:06 AM, Ilya Leoshkevich wrote: >> On Wed, 2025-01-29 at 10:45 -0700, Daniel Xu wrote: >>> On Wed, Jan 29, 2025 at 09:49:12AM -0700, Daniel Xu wrote: >>> > Hi Ilya, >>> > >>> > On Wed, Jan 29, 2025 at 03:58:54PM +0100, Ilya Leoshkevich wrote: >>> > > On Tue, 2025-01-14 at 13:28 -0700, Daniel Xu wrote: >>> > > > This commit allows progs to elide a null check on statically >>> > > > known >>> > > > map >>> > > > lookup keys. In other words, if the verifier can statically >>> > > > prove >>> > > > that >>> > > > the lookup will be in-bounds, allow the prog to drop the null >>> > > > check. >>> > > > >>> > > > This is useful for two reasons: >>> > > > >>> > > > 1. Large numbers of nullness checks (especially when they >>> > > > cannot >>> > > > fail) >>> > > > unnecessarily pushes prog towards >>> > > > BPF_COMPLEXITY_LIMIT_JMP_SEQ. >>> > > > 2. It forms a tighter contract between programmer and verifier. >>> > > > >>> > > > For (1), bpftrace is starting to make heavier use of percpu >>> > > > scratch >>> > > > maps. As a result, for user scripts with large number of >>> > > > unrolled >>> > > > loops, >>> > > > we are starting to hit jump complexity verification errors. >>> > > > These >>> > > > percpu lookups cannot fail anyways, as we only use static key >>> > > > values. >>> > > > Eliding nullness probably results in less work for verifier as >>> > > > well. >>> > > > >>> > > > For (2), percpu scratch maps are often used as a larger stack, >>> > > > as the >>> > > > currrent stack is limited to 512 bytes. In these situations, it >>> > > > is >>> > > > desirable for the programmer to express: "this lookup should >>> > > > never >>> > > > fail, >>> > > > and if it does, it means I messed up the code". By omitting the >>> > > > null >>> > > > check, the programmer can "ask" the verifier to double check >>> > > > the >>> > > > logic. >>> > > > >>> > > > Tests also have to be updated in sync with these changes, as >>> > > > the >>> > > > verifier is more efficient with this change. Notable, iters.c >>> > > > tests >>> > > > had >>> > > > to be changed to use a map type that still requires null >>> > > > checks, as >>> > > > it's >>> > > > exercising verifier tracking logic w.r.t iterators. >>> > > > >>> > > > Signed-off-by: Daniel Xu <dxu@xxxxxxxxx> >>> > > > --- >>> > > > kernel/bpf/verifier.c | 92 >>> > > > ++++++++++++++++++- >>> > > > tools/testing/selftests/bpf/progs/iters.c | 14 +-- >>> > > > .../selftests/bpf/progs/map_kptr_fail.c | 2 +- >>> > > > .../selftests/bpf/progs/verifier_map_in_map.c | 2 +- >>> > > > .../testing/selftests/bpf/verifier/map_kptr.c | 2 +- >>> > > > 5 files changed, 99 insertions(+), 13 deletions(-) >>> > > >>> > > [...] >>> > > >>> > > > @@ -9158,6 +9216,7 @@ static int check_func_arg(struct >>> > > > bpf_verifier_env *env, u32 arg, >>> > > > enum bpf_arg_type arg_type = fn->arg_type[arg]; >>> > > > enum bpf_reg_type type = reg->type; >>> > > > u32 *arg_btf_id = NULL; >>> > > > + u32 key_size; >>> > > > int err = 0; >>> > > > >>> > > > if (arg_type == ARG_DONTCARE) >>> > > > @@ -9291,8 +9350,13 @@ static int check_func_arg(struct >>> > > > bpf_verifier_env *env, u32 arg, >>> > > > verbose(env, "invalid map_ptr to >>> > > > access map- >>> > > > > key\n"); >>> > > > return -EACCES; >>> > > > } >>> > > > - err = check_helper_mem_access(env, regno, >>> > > > meta- >>> > > > > map_ptr->key_size, >>> > > > - BPF_READ, false, >>> > > > NULL); >>> > > > + key_size = meta->map_ptr->key_size; >>> > > > + err = check_helper_mem_access(env, regno, >>> > > > key_size, >>> > > > BPF_READ, false, NULL); >>> > > > + if (err) >>> > > > + return err; >>> > > > + meta->const_map_key = >>> > > > get_constant_map_key(env, reg, >>> > > > key_size); >>> > > > + if (meta->const_map_key < 0 && meta- >>> > > > >const_map_key >>> > > > != -EOPNOTSUPP) >>> > > > + return meta->const_map_key; >>> > > >>> > > Mark Hartmayer reported a problem that after this commit the >>> > > verifier >>> > > started refusing to load libvirt's virCgroupV2DevicesLoadProg(), >>> > > which >>> > > contains the following snippet: >>> > > >>> > > 53: (b7) r1 = -1 ; R1_w=-1 >>> > > 54: (7b) *(u64 *)(r10 -8) = r1 ; R1_w=-1 R10=fp0 fp-8_w=-1 >>> > > 55: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 >>> > > 56: (07) r2 += -8 ; R2_w=fp-8 >>> > > 57: (18) r1 = 0x9553c800 ; R1_w=map_ptr(ks=8,vs=4) >>> > > 59: (85) call bpf_map_lookup_elem#1 >>> > > >>> > > IIUC here the actual constant value is -1, which this code >>> > > confuses >>> > > with an error. >>> > >>> > Thanks for reporting. I think I know what the issue is - will send >>> > a >>> > patch shortly. >>> > >>> > Daniel >>> > >>> >>> I cribbed the source from [0] and tested before and after. I think >>> this >>> should work. Mind giving it a try? >>> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>> index 9971c03adfd5..e9176a5ce215 100644 >>> --- a/kernel/bpf/verifier.c >>> +++ b/kernel/bpf/verifier.c >>> @@ -9206,6 +9206,8 @@ static s64 get_constant_map_key(struct >>> bpf_verifier_env *env, >>> return reg->var_off.value; >>> } >>> >>> +static bool can_elide_value_nullness(enum bpf_map_type type); >>> + >>> static int check_func_arg(struct bpf_verifier_env *env, u32 arg, >>> struct bpf_call_arg_meta *meta, >>> const struct bpf_func_proto *fn, >>> @@ -9354,9 +9356,11 @@ static int check_func_arg(struct >>> bpf_verifier_env *env, u32 arg, >>> err = check_helper_mem_access(env, regno, key_size, >>> BPF_READ, false, NULL); >>> if (err) >>> return err; >>> - meta->const_map_key = get_constant_map_key(env, reg, >>> key_size); >>> - if (meta->const_map_key < 0 && meta->const_map_key != >>> -EOPNOTSUPP) >>> - return meta->const_map_key; >>> + if (can_elide_value_nullness(meta->map_ptr- >>> >map_type)) { >>> + meta->const_map_key = >>> get_constant_map_key(env, reg, key_size); >>> + if (meta->const_map_key < 0 && meta- >>> >const_map_key != -EOPNOTSUPP) >>> + return meta->const_map_key; >>> + } >>> break; >>> case ARG_PTR_TO_MAP_VALUE: >>> if (type_may_be_null(arg_type) && >>> register_is_null(reg)) >>> >>> Thanks, >>> Daniel >>> >>> >>> [0]: >>> https://github.com/libvirt/libvirt/blob/c1166be3475a0269f5164d87fec6227d6cb34b47/src/util/vircgroupv2devices.c#L66-L135 >> >> Thanks, I tried this in isolation and it fixed the issue for me. >> I talked to Mark and he will try it with his libvirt scenario. > > Thanks for testing! > >> >> The code looks reasonable to me, but I have a small concern regarding >> what will happen if the BPF code uses a -EOPNOTSUPP immediate with an >> array. Unlike other immediates, IIUC this will cause check_func_arg() >> to return 0. Is there a reason to have this special? > > That's a good point. Lemme check on that. Seems like the verifier tracks the s32 spill as a u32. So I couldn’t craft a repro for this. But it’ll be a good refactor to make anyways.