On Mon, Feb 8, 2021 at 10:44 PM Dmitrii Banshchikov <me@xxxxxxxxxxxxx> wrote: > > Add an ability to pass a pointer to a type with known size in arguments > of a global function. Such pointers may be used to overcome the limit on > the maximum number of arguments, avoid expensive and tricky workarounds > and to have multiple output arguments. > > A referenced type may contain pointers but indirect access through them > isn't supported. > > The implementation consists of two parts. If a global function has an > argument that is a pointer to a type with known size then: > > 1) In btf_check_func_arg_match(): check that the corresponding > register points to NULL or to a valid memory region that is large enough > to contain the expected argument's type. > > 2) In btf_prepare_func_args(): set the corresponding register type to > PTR_TO_MEM_OR_NULL and its size to the size of the expected type. > > A call to a global function may change stack's memory slot type(via > check_helper_mem_access) similar to a helper function. That is why new > pointer arguments are allowed only for functions with global linkage. > Consider a case: stack's memory slot type is changed to STACK_MISC from > spilled PTR_TO_PACKET(btf_check_func_arg_match() -> check_mem_reg() -> > check_helper_mem_access() -> check_stack_boundary()) after a call to a > static function and later verifier cannot infer safety of indirect > accesses through the stack memory(check_stack_read() -> > __mark_reg_unknown ()). This will break existing code. Ok, so it took me a while (few attempts and some playing around with static subprogs in selftests) to understand what is this paragraph is talking about. So let me confirm, and maybe we can use that to rephrase this into more clear explanation. So the problem with allowing any (<type> *) argument for static functions is that in such case static function might get successfully validated as if it was global (i.e., based on types of its input arguments). And in such case, corresponding stack slots in the caller program will be marked with MISC types, because in general case we can't know what kind of data is stored in the stack. So here, instead of allowing this for static functions and eventually optimistically validating it with STACK_MISC slots, we will fail upfront and will rely on verifier to fallback to "inline" validation of static functions, which will lead to a proper stack slots marking. It will be less efficient validation, but would preserve more information. For global we don't have the fallback case, so we'll just do our best and will live with MISC slots. Did I get this right? > > Signed-off-by: Dmitrii Banshchikov <me@xxxxxxxxxxxxx> > --- So apart from the very confusing bit about special global func handling here and some concerns about register ID generation, the logic looks good, so: Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > v1 -> v2: > - Allow pointers only for global functions > - Add support of any type with known size > - Use conventional way to generate reg id > - Use boolean true/false instead of int 1/0 > - Fix formatting > > include/linux/bpf_verifier.h | 2 ++ > kernel/bpf/btf.c | 55 +++++++++++++++++++++++++++++------- > kernel/bpf/verifier.c | 30 ++++++++++++++++++++ > 3 files changed, 77 insertions(+), 10 deletions(-) > [...] > @@ -5470,9 +5488,26 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, > reg->type = SCALAR_VALUE; > continue; > } > - if (btf_type_is_ptr(t) && > - btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { > - reg->type = PTR_TO_CTX; > + if (btf_type_is_ptr(t)) { > + if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { > + reg->type = PTR_TO_CTX; > + continue; > + } > + > + t = btf_type_skip_modifiers(btf, t->type, NULL); > + > + ref_t = btf_resolve_size(btf, t, ®->mem_size); > + if (IS_ERR(ref_t)) { > + bpf_log(log, > + "arg#%d reference type('%s %s') size cannot be determined: %ld\n", > + i, btf_type_str(t), btf_name_by_offset(btf, t->name_off), > + PTR_ERR(ref_t)); > + return -EINVAL; > + } > + > + reg->type = PTR_TO_MEM_OR_NULL; > + reg->id = i + 1; I see that in a bunch of other places we use reg->id = ++env->id_gen; to generate register IDs, that looks safer and should avoid any accidental ID collision. Is there any reason not to do that here? > + > continue; > } > bpf_log(log, "Arg#%d type %s in %s() is not supported yet.\n", > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index d68ea6eb4f9b..fd423af1cc57 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -3942,6 +3942,29 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, > } > } > > +int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, > + u32 regno, u32 mem_size) > +{ > + if (register_is_null(reg)) > + return 0; > + > + if (reg_type_may_be_null(reg->type)) { > + /* Assuming that the register contains a value check if the memory > + * access is safe. Temorarily save and restore the register's state as typo: temporarily > + * the conversion shouldn't be visible to a caller. > + */ > + const struct bpf_reg_state saved_reg = *reg; > + int rv; > + > + mark_ptr_not_null_reg(reg); > + rv = check_helper_mem_access(env, regno, mem_size, true, NULL); > + *reg = saved_reg; > + return rv; > + } > + > + return check_helper_mem_access(env, regno, mem_size, true, NULL); > +} > + > /* Implementation details: > * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL > * Two bpf_map_lookups (even with the same key) will have different reg->id. > @@ -11572,6 +11595,13 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) > mark_reg_known_zero(env, regs, i); > else if (regs[i].type == SCALAR_VALUE) > mark_reg_unknown(env, regs, i); > + else if (regs[i].type == PTR_TO_MEM_OR_NULL) { > + const u32 mem_size = regs[i].mem_size; > + > + mark_reg_known_zero(env, regs, i); > + regs[i].mem_size = mem_size; > + regs[i].id = ++env->id_gen; > + } > } > } else { > /* 1st arg to a function */ > -- > 2.25.1 >