On Sun, Nov 3, 2024 at 9:01 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > On Sun, 3 Nov 2024 at 10:40, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > > > On Sun, 3 Nov 2024 at 10:16, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > > > > > On Fri, 1 Nov 2024 at 17:56, Alexei Starovoitov > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > On Fri, Nov 1, 2024 at 12:16 PM Andrii Nakryiko > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > @@ -6693,7 +6709,21 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, > > > > > > > > > > > > if (ret < 0) > > > > > > return ret; > > > > > > - > > > > > > + /* For raw_tp progs, we allow dereference of PTR_MAYBE_NULL > > > > > > + * trusted PTR_TO_BTF_ID, these are the ones that are possibly > > > > > > + * arguments to the raw_tp. Since internal checks in for trusted > > > > > > + * reg in check_ptr_to_btf_access would consider PTR_MAYBE_NULL > > > > > > + * modifier as problematic, mask it out temporarily for the > > > > > > + * check. Don't apply this to pointers with ref_obj_id > 0, as > > > > > > + * those won't be raw_tp args. > > > > > > + * > > > > > > + * We may end up applying this relaxation to other trusted > > > > > > + * PTR_TO_BTF_ID with maybe null flag, since we cannot > > > > > > + * distinguish PTR_MAYBE_NULL tagged for arguments vs normal > > > > > > + * tagging, but that should expand allowed behavior, and not > > > > > > + * cause regression for existing behavior. > > > > > > + */ > > > > > > > > > > Yeah, I'm not sure why this has to be raw tp-specific?.. What's wrong > > > > > with the same behavior for BPF iterator programs, for example? > > > > > > > > > > It seems nicer if we can avoid this temporary masking and instead > > > > > support this as a generic functionality? Or are there complications? > > > > > > > > > > > We _can_ do this for all programs. The thought process here was to > > > leave existing raw_tp programs unbroken if possible if we're marking > > > their arguments as PTR_MAYBE_NULL, since most of them won't be > > > performing any NULL checks at all. > > > > > > > > > + mask = mask_raw_tp_reg(env, reg); > > > > > > if (ret != PTR_TO_BTF_ID) { > > > > > > /* just mark; */ > > > > > > > > > > > > @@ -6754,8 +6784,13 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, > > > > > > clear_trusted_flags(&flag); > > > > > > } > > > > > > > > > > > > - if (atype == BPF_READ && value_regno >= 0) > > > > > > + if (atype == BPF_READ && value_regno >= 0) { > > > > > > mark_btf_ld_reg(env, regs, value_regno, ret, reg->btf, btf_id, flag); > > > > > > + /* We've assigned a new type to regno, so don't undo masking. */ > > > > > > + if (regno == value_regno) > > > > > > + mask = false; > > > > > > + } > > > > > > + unmask_raw_tp_reg(reg, mask); > > > > > > > > Kumar, > > > > > > > > I chatted with Andrii offline. All other cases of mask/unmask > > > > should probably stay raw_tp specific, but it seems we can make > > > > this particular case to be generic. > > > > Something like the following: > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > index 797cf3ed32e0..bbd4c03460e3 100644 > > > > --- a/kernel/bpf/verifier.c > > > > +++ b/kernel/bpf/verifier.c > > > > @@ -6703,7 +6703,11 @@ static int check_ptr_to_btf_access(struct > > > > bpf_verifier_env *env, > > > > */ > > > > flag = PTR_UNTRUSTED; > > > > > > > > + } else if (reg->type == (PTR_TO_BTF_ID | PTR_TRUSTED | > > > > PTR_MAYBE_NULL)) { > > > > + flag |= PTR_MAYBE_NULL; > > > > + goto trusted; > > > > } else if (is_trusted_reg(reg) || is_rcu_reg(reg)) { > > > > +trusted: > > > > > > > > With the idea that trusted_or_null stays that way for all prog > > > > types and bpf_iter__task->task deref stays trusted_or_null > > > > instead of being downgraded to ptr_to_btf_id without any flags. > > > > So progs can do few less != null checks. > > > > Need to think it through. > > > > > > Ok. But don't allow passing such pointers into helpers, right? > > > We do that for raw_tp to preserve compat, but it would just exacerbate > > > the issue if we start doing it everywhere. > > > So it's just that dereferencing a _or_null pointer becomes an ok thing to do? > > > Let me mull over this for a bit. > > > > > > I'm not sure whether not doing the NULL check is better or worse > > > though. On one hand everything will work without checking for NULL, on > > > the other hand people may also assume the verifier isn't complaining > > > because the pointer is valid, and then they read data from the pointer > > > which always ends up being zero, meaning different things for > > > different kinds of fields. > > > > > > Just thinking out loud, but one of the other concerns would be that > > > we're encouraging people not to do these NULL checks, which means a > > > potential page fault penalty everytime that pointer _is_ NULL, instead > > > of a simple branch, which would certainly be a bit expensive. If this > > > becomes the common case, I think the prog execution latency penalty > > > will be big. It is something to consider. > > > > Ah, no, my bad, this won't be a problem now, as the JIT does emit a > > branch to check for kernel addresses, but it probably will be if > > https://lore.kernel.org/bpf/20240619092216.1780946-1-memxor@xxxxxxxxx/ > > gets accepted. > > I applied this and tried to find out the time it takes to dereference > a NULL pointer using bpf_ktime_get_ns > With the patch above: time=3345 ns > Without (bpf-next right now): time=170 ns > > So I guess that means I should probably drop the patch above if we > decide to allow dereferencing NULL for all programs. Your concerns are valid. Accepting deref of trusted_or_null generically will cause these issues long term. It's indeed better to make programmers add explicit !=NULL to their programs. So scratch my earlier suggestion. Let's keep this patch as-is with special hack for raw_tp only that we will hopefully resolve later either with __nullable suffix or compiler detection of nullability. The latter we discussed briefly with Eduard. It's doable to teach llvm to emit btf_tag to aid the verifier. gcc may catch up eventually.