Re: [PATCH bpf-next v4 1/2] bpf: verifier: Support eliding map lookup nullness

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 3 Oct 2024 17:37:18 -0700

On Wed, Oct 2, 2024 at 5:12 PM Daniel Xu <dxu@xxxxxxxxx> wrote:
>
> This commit allows progs to elide a null check on statically known map
> lookup keys. In other words, if the verifier can statically prove that
> the lookup will be in-bounds, allow the prog to drop the null check.
>
> This is useful for two reasons:
>
> 1. Large numbers of nullness checks (especially when they cannot fail)
>    unnecessarily pushes prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
> 2. It forms a tighter contract between programmer and verifier.
>
> For (1), bpftrace is starting to make heavier use of percpu scratch
> maps. As a result, for user scripts with large number of unrolled loops,
> we are starting to hit jump complexity verification errors.  These
> percpu lookups cannot fail anyways, as we only use static key values.
> Eliding nullness probably results in less work for verifier as well.
>
> For (2), percpu scratch maps are often used as a larger stack, as the
> currrent stack is limited to 512 bytes. In these situations, it is
> desirable for the programmer to express: "this lookup should never fail,
> and if it does, it means I messed up the code". By omitting the null
> check, the programmer can "ask" the verifier to double check the logic.
>
> Tests also have to be updated in sync with these changes, as the
> verifier is more efficient with this change. Notable, iters.c tests had
> to be changed to use a map type that still requires null checks, as it's
> exercising verifier tracking logic w.r.t iterators.
>
> Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> Signed-off-by: Daniel Xu <dxu@xxxxxxxxx>
> ---
>  kernel/bpf/verifier.c                         | 73 ++++++++++++++++++-
>  tools/testing/selftests/bpf/progs/iters.c     | 14 ++--
>  .../selftests/bpf/progs/map_kptr_fail.c       |  2 +-
>  .../selftests/bpf/progs/verifier_map_in_map.c |  2 +-
>  .../testing/selftests/bpf/verifier/map_kptr.c |  2 +-
>  5 files changed, 82 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 7d9b38ffd220..9ee2cd6c0508 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -284,6 +284,7 @@ struct bpf_call_arg_meta {
>         u32 ret_btf_id;
>         u32 subprogno;
>         struct btf_field *kptr_field;
> +       long const_map_key;

key might collide with special -1 on 32-bit archs ?
s64 instead ?

>  };
>
>  struct bpf_kfunc_call_arg_meta {
> @@ -10416,6 +10417,60 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
>                                  state->callback_subprogno == subprogno);
>  }
>
> +/* Returns whether or not the given map type can potentially elide
> + * lookup return value nullness check. This is possible if the key
> + * is statically known.
> + */
> +static bool can_elide_value_nullness(enum bpf_map_type type)
> +{
> +       switch (type) {
> +       case BPF_MAP_TYPE_ARRAY:
> +       case BPF_MAP_TYPE_PERCPU_ARRAY:
> +               return true;
> +       default:
> +               return false;
> +       }
> +}
> +
> +/* Returns constant key value if possible, else -1 */
> +static long get_constant_map_key(struct bpf_verifier_env *env,
> +                                struct bpf_reg_state *key)
> +{
> +       struct bpf_func_state *state = func(env, key);
> +       struct bpf_reg_state *reg;
> +       int stack_off;
> +       int slot;
> +       int spi;
> +
> +       if (!env->bpf_capable)
> +               return -1;
> +       if (key->type != PTR_TO_STACK)
> +               return -1;
> +       if (!tnum_is_const(key->var_off))
> +               return -1;
> +
> +       stack_off = key->off + key->var_off.value;
> +       slot = -stack_off - 1;
> +       if (slot < 0)
> +               /* Stack grew upwards. This is properly checked
> +                * as part of helper argument processing, but since
> +                * this runs before those checks, we need to preemptively
> +                * do this here.
> +                */
> +               return -1;
> +       else if (slot >= state->allocated_stack)
> +               /* Stack uninitialized */
> +               return -1;
> +
> +       spi = slot / BPF_REG_SIZE;
> +       reg = &state->stack[spi].spilled_ptr;

reg->type == SCALAR
check is missing.
slot->type should also be checked.
spiller_ptr is only correct for STACK_SPILL.

it should also recognize that STACK_ZERO means zero.

> +       if (!tnum_is_const(reg->var_off))
> +               /* Stack value not statically known */
> +               return -1;

I suspect tnum_is_const could pass for pointers and other !scalar.

> +
> +       return reg->var_off.value;

and this will be zero by accident.

Bits of check_stack_read_fixed_off() should probably be
factored out and used here.

> +}
> +
>  static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
>                             const struct bpf_func_proto **ptr)
>  {
> @@ -10513,6 +10568,15 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                         env->insn_aux_data[insn_idx].storage_get_func_atomic = true;
>         }
>
> +       /* Logically we are trying to check on key register state before
> +        * the helper is called, so process here. Otherwise argument processing
> +        * may clobber the spilled key values.
> +        */
> +       regs = cur_regs(env);
> +       if (func_id == BPF_FUNC_map_lookup_elem)
> +               meta.const_map_key = get_constant_map_key(env, &regs[BPF_REG_2]);
> +
> +

I think the logic is too specific.
Let's make it more generic.
It probably better to do later in:
case ARG_PTR_TO_MAP_KEY:
after check_helper_mem_access() returns 0
and store into meta.const_map_key.

>         meta.func_id = func_id;
>         /* check args */
>         for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) {
> @@ -10773,10 +10837,17 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                                 "kernel subsystem misconfigured verifier\n");
>                         return -EINVAL;
>                 }
> +
> +               if (func_id == BPF_FUNC_map_lookup_elem &&

and only here check for func_id.

Saving meta.const_map_key can be unconditional for all ARG_PTR_TO_MAP_KEY.

We can use it in a follow up to optimize lookup/update from arrays.
Currently array_map_gen_lookup() still emits a load from key
and bounds check.
All that can be optimized with const_map_key.

> +                   can_elide_value_nullness(meta.map_ptr->map_type) &&
> +                   meta.const_map_key >= 0 &&
> +                   meta.const_map_key < meta.map_ptr->max_entries)
> +                       ret_flag &= ~PTR_MAYBE_NULL;
> +
>                 regs[BPF_REG_0].map_ptr = meta.map_ptr;
>                 regs[BPF_REG_0].map_uid = meta.map_uid;
>                 regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag;
> -               if (!type_may_be_null(ret_type) &&
> +               if (!type_may_be_null(regs[BPF_REG_0].type) &&

I would use ret_flag here instead of R0.type.

pw-bot: cr