On Fri, Nov 15, 2019 at 08:13:58AM +0100, Daniel Borkmann wrote:
> On Thu, Nov 14, 2019 at 08:29:41PM -0800, Alexei Starovoitov wrote:
> > On Fri, Nov 15, 2019 at 02:04:02AM +0100, Daniel Borkmann wrote:
> > > Add tracking of constant keys into tail call maps. The signature of
> > > bpf_tail_call_proto is that arg1 is ctx, arg2 map pointer and arg3
> > > is a index key. The direct call approach for tail calls can be enabled
> > > if the verifier asserted that for all branches leading to the tail call
> > > helper invocation, the map pointer and index key were both constant
> > > and the same. Tracking of map pointers we already do from prior work
> > > via c93552c443eb ("bpf: properly enforce index mask to prevent out-of-bounds
> > > speculation") and 09772d92cd5a ("bpf: avoid retpoline for lookup/update/
> > > delete calls on maps"). Given the tail call map index key is not on
> > > stack but directly in the register, we can add similar tracking approach
> > > and later in fixup_bpf_calls() add a poke descriptor to the progs poke_tab
> > > with the relevant information for the JITing phase. We internally reuse
> > > insn->imm for the rewritten BPF_JMP | BPF_TAIL_CALL instruction in order
> > > to point into the prog's poke_tab and keep insn->imm == 0 as indicator
> > > that current indirect tail call emission must be used.
> > >
> > > Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > > ---
> > >  include/linux/bpf_verifier.h |  1 +
> > >  kernel/bpf/verifier.c        | 98 ++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 99 insertions(+)
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index cdd08bf0ec06..f494f0c9ac13 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -301,6 +301,7 @@ struct bpf_insn_aux_data {
> > >  			u32 map_off;		/* offset from value base address */
> > >  		};
> > >  	};
> > > +	u64 key_state; /* constant key tracking for maps */
> >
> > may be map_key_state ?
> > key_state is a bit ambiguous in the bpf_insn_aux_data.
>
> Could be, alternatively could also be idx_state or map_idx_state since
> it's really just for u32 type key indices.
>
> > > +static int
> > > +record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
> > > +		int func_id, int insn_idx)
> > > +{
> > > +	struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > > +	struct bpf_reg_state *regs = cur_regs(env), *reg;
> > > +	struct tnum range = tnum_range(0, U32_MAX);
> > > +	struct bpf_map *map = meta->map_ptr;
> > > +	u64 val;
> > > +
> > > +	if (func_id != BPF_FUNC_tail_call)
> > > +		return 0;
> > > +	if (!map || map->map_type != BPF_MAP_TYPE_PROG_ARRAY) {
> > > +		verbose(env, "kernel subsystem misconfigured verifier\n");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	reg = &regs[BPF_REG_3];
> > > +	if (!register_is_const(reg) || !tnum_in(range, reg->var_off)) {
> > > +		bpf_map_key_store(aux, BPF_MAP_KEY_POISON);
> > > +		return 0;
> > > +	}
> > > +
> > > +	val = reg->var_off.value;
> > > +	if (bpf_map_key_unseen(aux))
> > > +		bpf_map_key_store(aux, val);
> > > +	else if (bpf_map_key_immediate(aux) != val)
> > > +		bpf_map_key_store(aux, BPF_MAP_KEY_POISON);
> > > +	return 0;
> > > +}
> >
> > I think this analysis is very useful in other cases as well.
> > Could you generalize it for array map lookups ? The key used in
> > bpf_map_lookup_elem() for arrays is often constant. In such cases we can
> > optimize array_map_gen_lookup() into absolute pointer. It will be
> > possible to do
> > if (idx < max_entries) ptr += idx * elem_size;
> > during verification instead of runtime and the whole
> > bpf_map_lookup_elem(map, &key); will become single instruction that
> > assigns &array[idx] into R0.
>
> Was thinking exactly the same. ;-) I started coding this yesterday night [0],
> but then had the (in hindsight obvious) realization that as-is the key_state
> holds the address but not the index for plain array map lookup. Hence I'd need
> to go a step further there to look at the const stack content. Will proceed on
> this as a separate set on top.
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/dborkman/bpf.git/commit/?h=pr/bpf-tail-call-rebased2&id=b86b7eae4646d8233e3e9058e68fef27536bf0c4

yeah. good point. For map_lookup it's obvious that the verifier needs to
compare both map ptr and *key, but that is the case for bpf_tail_call too, no?
It seems tracking 'key_state' only is not enough. Consider:
if (..)
  map = mapA;
else
  map = mapB;
bpf_tail_call(ctx, map, 1);

Maybe, to generalize the logic, the verifier should remember bpf_reg_state
instead of a specific part of it like a u32 ?
The verifier keeps
insn_aux_data[insn_idx].ptr_type; to prevent incorrect ctx access.
That can also be generalized? Probably later, but conceptually it's the same
category of tracking that the verifier needs to do.
For bpf_map_lookup and bpf_tail_call callsite it can remember bpf_reg_state
of r1,r2,r3. The bpf_reg_state should be saved in insn_aux_data the first
time the verifier goes through the callsite, then every time the verifier goes
through the callsite again additional per-helper logic is invoked.
Like for bpf_tail_call it will check:
if (tnum_is_const(insn_aux_data[callsite]->r3_reg_state->var_off))
  // good. may be can optimize later.
and will use insn_aux_data[callsite]->r2_reg_state->map_ptr plus
insn_aux_data[callsite]->r3_reg_state->var_off
to compute bpf_prog's jited address inside that prog_array.
Similarly for bpf_map_lookup... r1_reg_state->map_ptr is the same map
for saved insn_aux_data->r1_reg_state and for current->r1.
The r2_reg_state should be PTR_TO_STACK and that stack value should be u32 const.
Should be a bit more generic and extensible... instead of
specific 'key_state' ?
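
To make the mapA/mapB case concrete, here is a rough standalone sketch in
plain userspace C, not kernel code. It only illustrates the idea of saving
per-callsite state on the first pass and comparing on later passes, with the
map pointer and the key tracked side by side. Every name in it
(callsite_state, record_callsite, can_direct_call, TRACK_*) is made up for
illustration and does not exist in the verifier. Per the commit message, the
real verifier already tracks map pointers separately from prior work; the
sketch just puts both checks in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for verifier state; not kernel types. */
enum track_state { TRACK_UNSEEN, TRACK_CONST, TRACK_POISON };

struct callsite_state {
	enum track_state map_state;
	enum track_state key_state;
	const void *map_ptr;	/* meaningful only if map_state == TRACK_CONST */
	uint32_t key;		/* meaningful only if key_state == TRACK_CONST */
};

/* Record one verification pass through a tail call callsite. */
static void record_callsite(struct callsite_state *cs, const void *map_ptr,
			    bool key_is_const, uint64_t key)
{
	assert(cs != NULL);

	/* Map pointer: the first pass stores it, a different map poisons. */
	if (cs->map_state == TRACK_UNSEEN) {
		cs->map_ptr = map_ptr;
		cs->map_state = TRACK_CONST;
	} else if (cs->map_state == TRACK_CONST && cs->map_ptr != map_ptr) {
		cs->map_state = TRACK_POISON;
	}

	/* Key: must be a known constant that fits into a u32 index. */
	if (!key_is_const || key > UINT32_MAX) {
		cs->key_state = TRACK_POISON;
	} else if (cs->key_state == TRACK_UNSEEN) {
		cs->key = (uint32_t)key;
		cs->key_state = TRACK_CONST;
	} else if (cs->key_state == TRACK_CONST && cs->key != (uint32_t)key) {
		cs->key_state = TRACK_POISON;
	}
}

/* A direct call is only an option if both map and key stayed constant. */
static bool can_direct_call(const struct callsite_state *cs)
{
	return cs->map_state == TRACK_CONST && cs->key_state == TRACK_CONST;
}
```

With a zero-initialized callsite_state, two passes that use key 1 but
different maps leave key_state at TRACK_CONST and map_state at TRACK_POISON,
so can_direct_call() returns false. That is the situation from the
if/else example above.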