On Wed, Feb 10, 2021 at 03:32:48PM -0800, Andrii Nakryiko wrote: > On Mon, Feb 8, 2021 at 10:44 PM Dmitrii Banshchikov <me@xxxxxxxxxxxxx> wrote: > > > > Add an ability to pass a pointer to a type with known size in arguments > > of a global function. Such pointers may be used to overcome the limit on > > the maximum number of arguments, avoid expensive and tricky workarounds > > and to have multiple output arguments. > > > > A referenced type may contain pointers but indirect access through them > > isn't supported. > > > > The implementation consists of two parts. If a global function has an > > argument that is a pointer to a type with known size then: > > > > 1) In btf_check_func_arg_match(): check that the corresponding > > register points to NULL or to a valid memory region that is large enough > > to contain the expected argument's type. > > > > 2) In btf_prepare_func_args(): set the corresponding register type to > > PTR_TO_MEM_OR_NULL and its size to the size of the expected type. > > > > A call to a global function may change stack's memory slot type(via > > check_helper_mem_access) similar to a helper function. That is why new > > pointer arguments are allowed only for functions with global linkage. > > Consider a case: stack's memory slot type is changed to STACK_MISC from > > spilled PTR_TO_PACKET(btf_check_func_arg_match() -> check_mem_reg() -> > > check_helper_mem_access() -> check_stack_boundary()) after a call to a > > static function and later verifier cannot infer safety of indirect > > accesses through the stack memory(check_stack_read() -> > > __mark_reg_unknown ()). This will break existing code. > > Ok, so it took me a while (few attempts and some playing around with > static subprogs in selftests) to understand what is this paragraph is > talking about. So let me confirm, and maybe we can use that to > rephrase this into more clear explanation. > > So the problem with allowing any (<type> *) argument for static > functions is that in such case static function might get successfully > validated as if it was global (i.e., based on types of its input > arguments). And in such case, corresponding stack slots in the caller > program will be marked with MISC types, because in general case we > can't know what kind of data is stored in the stack. > > So here, instead of allowing this for static functions and eventually > optimistically validating it with STACK_MISC slots, we will fail > upfront and will rely on verifier to fallback to "inline" validation > of static functions, which will lead to a proper stack slots marking. > It will be less efficient validation, but would preserve more > information. For global we don't have the fallback case, so we'll just > do our best and will live with MISC slots. > > Did I get this right? Yes and sorry for the cryptic issue description. A slot type may be changed due to the fact that after a call to global or a helper function we don't have a guarantee that the data won't changed. The only guarantee we have is that it is still a valid memory and hence we have to change its slot type to MISC(because possibly it have been overwritten by a helper or a global function and we cannot rely on the previous type). Allowing pointers in arguments for static functions breaks existing code because verifier is no longer able to infer safety of memory(due to the slot type change). Hence we have to allow it only for global functions. The verifier continues to use "inline" validation for static functions (we are not making it worse) and more types of global functions are supported(we are making it better). I will rephrase the commit message in v3. > > > > > Signed-off-by: Dmitrii Banshchikov <me@xxxxxxxxxxxxx> > > --- > > So apart from the very confusing bit about special global func > handling here and some concerns about register ID generation, the > logic looks good, so: > > Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> Thank you. I will address wording, ID generation and typo in v3. > > > v1 -> v2: > > - Allow pointers only for global functions > > - Add support of any type with known size > > - Use conventional way to generate reg id > > - Use boolean true/false instead of int 1/0 > > - Fix formatting > > > > include/linux/bpf_verifier.h | 2 ++ > > kernel/bpf/btf.c | 55 +++++++++++++++++++++++++++++------- > > kernel/bpf/verifier.c | 30 ++++++++++++++++++++ > > 3 files changed, 77 insertions(+), 10 deletions(-) > > > > [...] > > > @@ -5470,9 +5488,26 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, > > reg->type = SCALAR_VALUE; > > continue; > > } > > - if (btf_type_is_ptr(t) && > > - btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { > > - reg->type = PTR_TO_CTX; > > + if (btf_type_is_ptr(t)) { > > + if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { > > + reg->type = PTR_TO_CTX; > > + continue; > > + } > > + > > + t = btf_type_skip_modifiers(btf, t->type, NULL); > > + > > + ref_t = btf_resolve_size(btf, t, ®->mem_size); > > + if (IS_ERR(ref_t)) { > > + bpf_log(log, > > + "arg#%d reference type('%s %s') size cannot be determined: %ld\n", > > + i, btf_type_str(t), btf_name_by_offset(btf, t->name_off), > > + PTR_ERR(ref_t)); > > + return -EINVAL; > > + } > > + > > + reg->type = PTR_TO_MEM_OR_NULL; > > + reg->id = i + 1; > > I see that in a bunch of other places we use reg->id = ++env->id_gen; > to generate register IDs, that looks safer and should avoid any > accidental ID collision. Is there any reason not to do that here? Yes, you are right - I somehow lost this while working on the second version. I will double check that it works as expected in v3. > > > + > > continue; > > } > > bpf_log(log, "Arg#%d type %s in %s() is not supported yet.\n", > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index d68ea6eb4f9b..fd423af1cc57 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -3942,6 +3942,29 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, > > } > > } > > > > +int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, > > + u32 regno, u32 mem_size) > > +{ > > + if (register_is_null(reg)) > > + return 0; > > + > > + if (reg_type_may_be_null(reg->type)) { > > + /* Assuming that the register contains a value check if the memory > > + * access is safe. Temorarily save and restore the register's state as > > typo: temporarily > > > + * the conversion shouldn't be visible to a caller. > > + */ > > + const struct bpf_reg_state saved_reg = *reg; > > + int rv; > > + > > + mark_ptr_not_null_reg(reg); > > + rv = check_helper_mem_access(env, regno, mem_size, true, NULL); > > + *reg = saved_reg; > > + return rv; > > + } > > + > > + return check_helper_mem_access(env, regno, mem_size, true, NULL); > > +} > > + > > /* Implementation details: > > * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL > > * Two bpf_map_lookups (even with the same key) will have different reg->id. > > @@ -11572,6 +11595,13 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) > > mark_reg_known_zero(env, regs, i); > > else if (regs[i].type == SCALAR_VALUE) > > mark_reg_unknown(env, regs, i); > > + else if (regs[i].type == PTR_TO_MEM_OR_NULL) { > > + const u32 mem_size = regs[i].mem_size; > > + > > + mark_reg_known_zero(env, regs, i); > > + regs[i].mem_size = mem_size; > > + regs[i].id = ++env->id_gen; > > + } > > } > > } else { > > /* 1st arg to a function */ > > -- > > 2.25.1 > > -- Dmitrii Banshchikov