On Fri, 13 Dec 2024 at 21:06, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Fri, 2024-12-13 at 09:51 -0800, Kumar Kartikeya Dwivedi wrote: > > Arguments to a raw tracepoint are tagged as trusted, which carries the > > semantics that the pointer will be non-NULL. However, in certain cases, > > a raw tracepoint argument may end up being NULL. More context about this > > issue is available in [0]. > > > > Thus, there is a discrepancy between the reality, that raw_tp arguments can > > actually be NULL, and the verifier's knowledge, that they are never NULL, > > causing explicit NULL check branch to be dead code eliminated. > > > > A previous attempt [1], i.e. the second fixed commit, was made to > > simulate symbolic execution as if in most accesses, the argument is a > > non-NULL raw_tp, except for conditional jumps. This tried to suppress > > branch prediction while preserving compatibility, but surfaced issues > > with production programs that were difficult to solve without increasing > > verifier complexity. A more complete discussion of issues and fixes is > > available at [2]. > > > > Fix this by maintaining an explicit list of tracepoints where the > > arguments are known to be NULL, and mark the positional arguments as > > PTR_MAYBE_NULL. Additionally, capture the tracepoints where arguments > > are known to be ERR_PTR, and mark these arguments as scalar values to > > prevent potential dereference. > > > > Each hex digit is used to encode NULL-ness (0x1) or ERR_PTR-ness (0x2), > > shifted by the zero-indexed argument number x 4. This can be represented > > as follows: > > 1st arg: 0x1 > > 2nd arg: 0x10 > > 3rd arg: 0x100 > > ... and so on (likewise for ERR_PTR case). > > > > In the future, an automated pass will be used to produce such a list, or > > insert __nullable annotations automatically for tracepoints. Each > > compilation unit will be analyzed and results will be collated to find > > whether a tracepoint pointer is definitely not null, maybe null, or an > > unknown state where verifier conservatively marks it PTR_MAYBE_NULL. > > A proof of concept of this tool from Eduard is available at [3]. > > > > Note that in case we don't find a specification in the raw_tp_null_args > > array and the tracepoint belongs to a kernel module, we will > > conservatively mark the arguments as PTR_MAYBE_NULL. This is because > > unlike for in-tree modules, out-of-tree module tracepoints may pass NULL > > freely to the tracepoint. We don't protect against such tracepoints > > passing ERR_PTR (which is uncommon anyway), lest we mark all such > > arguments as SCALAR_VALUE. > > > > While we are it, let's adjust the test raw_tp_null to not perform > > dereference of the skb->mark, as that won't be allowed anymore, and make > > it more robust by using inline assembly to test the dead code > > elimination behavior, which should still stay the same. > > > > [0]: https://lore.kernel.org/bpf/ZrCZS6nisraEqehw@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx > > [1]: https://lore.kernel.org/all/20241104171959.2938862-1-memxor@xxxxxxxxx > > [2]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@xxxxxxxxx > > [3]: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params > > > > Reported-by: Juri Lelli <juri.lelli@xxxxxxxxxx> # original bug > > Reported-by: Manu Bretelle <chantra@xxxxxxxx> # bugs in masking fix > > Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") > > Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") > > Co-developed-by: Jiri Olsa <jolsa@xxxxxxxxxx> > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > --- > > Tbh, I think we should have fixed the bug in what is currently in the > tree and avoid revert. Anyways, the code looks good to me. > > Reviewed-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > Thanks for the review, I have addressed the nits and resent v3. > [...] > > > @@ -6597,6 +6693,39 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, > > if (btf_param_match_suffix(btf, &args[arg], "__nullable")) > > info->reg_type |= PTR_MAYBE_NULL; > > > > + if (prog->expected_attach_type == BPF_TRACE_RAW_TP) { > > + struct btf *btf = prog->aux->attach_btf; > > + const struct btf_type *t; > > + const char *tname; > > + > > + /* BTF lookups cannot fail, return false on error */ > > + t = btf_type_by_id(btf, prog->aux->attach_btf_id); > > + if (!t) > > + return false; > > + tname = btf_name_by_offset(btf, t->name_off); > > + if (!tname) > > + return false; > > + /* Checked by bpf_check_attach_target */ > > + tname += sizeof("bpf_trace_") - 1; > > Nit: bpf_check_attach_target uses "btf_trace_" prefix. > > > + for (i = 0; i < ARRAY_SIZE(raw_tp_null_args); i++) { > > + /* Is this a func with potential NULL args? */ > > + if (strcmp(tname, raw_tp_null_args[i].func)) > > + continue; > > + if (raw_tp_null_args[i].mask & (0x1 << (arg * 4))) > > + info->reg_type |= PTR_MAYBE_NULL; > > + /* Is the current arg IS_ERR? */ > > + if (raw_tp_null_args[i].mask & (0x2 << (arg * 4))) > > + ptr_err_raw_tp = true; > > + break; > > + } > > + /* If we don't know NULL-ness specification and the tracepoint > > + * is coming from a loadable module, be conservative and mark > > + * argument as PTR_MAYBE_NULL. > > + */ > > + if (i == ARRAY_SIZE(raw_tp_null_args) && btf_is_module(btf)) > > + info->reg_type |= PTR_MAYBE_NULL; > > + } > > + > > if (tgt_prog) { > > enum bpf_prog_type tgt_type; > > > > @@ -6641,6 +6770,13 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, > > bpf_log(log, "func '%s' arg%d has btf_id %d type %s '%s'\n", > > tname, arg, info->btf_id, btf_type_str(t), > > __btf_name_by_offset(btf, t->name_off)); > > + > > + /* Perform all checks on the validity of type for this argument, but if > > + * we know it can be IS_ERR at runtime, scrub pointer type and mark as > > + * scalar. > > + */ > > + if (ptr_err_raw_tp) > > + info->reg_type = SCALAR_VALUE; > > Nit: the log line above would be a bit confusing if 'ptr_err_raw_tp' would be true. > maybe add an additional line here, saying that verifier overrides BTF type? > > > return true; > > } > > EXPORT_SYMBOL_GPL(btf_ctx_access); > > [...] >