Re: [PATCH bpf-next 5/8] bpf: add __arg_trusted and __arg_untrusted global func tags

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 12 Jan 2024 10:43:37 -0800

On Thu, Jan 11, 2024 at 6:05 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Jan 4, 2024 at 4:09 PM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> >
> > Add support for passing PTR_TO_BTF_ID registers to global subprogs.
> > Currently two flavors are supported: trusted and untrusted. In both
> > cases we assume _OR_NULL variants, so user would be forced to do a NULL
> > check. __arg_nonnull can be used in conjunction with
> > __arg_{trusted,untrusted} annotation to avoid unnecessary NULL checks,
> > and subprog caller will be forced to provided known non-NULL pointers.
> >
> > Alternatively, we can add __arg_nullable and change default to be
> > non-_OR_NULL, and __arg_nullable will allow to force NULL checks, if
> > necessary. This probably would be a better usability overall, so let's
> > discuss this.
>
> Let's make __arg_trusted to be non-null by default.
> I think it better matches existing kfuncs with KF_TRUSTED_ARGS.

yep, agreed, already did locally

>
> Also pls use __arg_maybe_null name. "nullable" is difficult
> to read and understand. maybe_null matches our internal names too.

not happy about verbose "maybe_null" naming, tbh, but I don't want to
bikeshed. I like "nullable" because it's a well-known concept across
many languages, so it doesn't seem to be that alien. And it nicely
complements __arg_nonnull.

But I'll rename it because I don't want bikeshedding.

>
> Another thought..
> may be add __arg_trusted_or_null alias that
> will emit two decl tags "arg:trusted", "arg:maybe_null" ?

no, I want to keep them separate. In the future this maybe_null can be
combined with some other tags, I don't want to create all combinations
as macros. I think __arg_trusted __arg_maybe_null is totally fine as
an annotation and doesn't save much in terms of typing if it's
combined into a single macro.

>
> More below.
>
> > Note, we disallow global subprogs to destroy passed in PTR_TO_BTF_ID
> > arguments, even the trusted one. We achieve that by not setting
> > ref_obj_id when validating subprog code. This basically enforces (in
> > Rust terms) borrowing semantics vs move semantics. Borrowing semantics
> > seems to be a better fit for isolated global subprog validation
> > approach.
> >
> > Implementation-wise, we utilize existing logic for matching
> > user-provided BTF type to kernel-side BTF type, used by BPF CO-RE logic
> > and following same matching rules. We enforce a unique match for types.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  include/linux/bpf_verifier.h                  |   1 +
> >  kernel/bpf/btf.c                              | 103 +++++++++++++++---
> >  kernel/bpf/verifier.c                         |  31 ++++++
> >  .../bpf/progs/nested_trust_failure.c          |   2 +-
> >  4 files changed, 123 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index d07d857ca67f..eaea129c38ff 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -610,6 +610,7 @@ struct bpf_subprog_arg_info {
> >         enum bpf_arg_type arg_type;
> >         union {
> >                 u32 mem_size;
> > +               u32 btf_id;
> >         };
> >  };
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 368d8fe19d35..287a0581516e 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -6802,9 +6802,78 @@ static bool btf_is_dynptr_ptr(const struct btf *btf, const struct btf_type *t)
> >         return false;
> >  }
> >
> > +struct bpf_cand_cache {
> > +       const char *name;
> > +       u32 name_len;
> > +       u16 kind;
> > +       u16 cnt;
> > +       struct {
> > +               const struct btf *btf;
> > +               u32 id;
> > +       } cands[];
> > +};
> > +
> > +static DEFINE_MUTEX(cand_cache_mutex);
> > +
> > +static struct bpf_cand_cache *
> > +bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id);
> > +
> > +static int btf_get_ptr_to_btf_id(struct bpf_verifier_log *log, int arg_idx,
> > +                                const struct btf *btf, const struct btf_type *t)
> > +{
> > +       struct bpf_cand_cache *cc;
> > +       struct bpf_core_ctx ctx = {
> > +               .btf = btf,
> > +               .log = log,
> > +       };
> > +       u32 kern_type_id, type_id;
> > +       int err = 0;
> > +
> > +       /* skip PTR and modifiers */
> > +       type_id = t->type;
> > +       t = btf_type_by_id(btf, t->type);
> > +       while (btf_type_is_modifier(t)) {
> > +               type_id = t->type;
> > +               t = btf_type_by_id(btf, t->type);
> > +       }
> > +
> > +       mutex_lock(&cand_cache_mutex);
> > +       cc = bpf_core_find_cands(&ctx, type_id);
>
> heh. core-in-the-kernel got another use case :)

it's probably also what we should use in btf_translate_to_vmlinux(),
btw, and is the reason we got into this sk_buff vs __sk_buff confusion
in the first place. But that's not part of my upcoming changes.

>
> This allows flavors as well, so
> bpf_prog(struct task_struct___v1 *tsk __arg_trusted)
> will find the match.
> I think it's fine to do. But let's add a test too.
> I don't think patch 8 checks that.

Good point, will add.

>
> > +       if (IS_ERR(cc)) {
> > +               err = PTR_ERR(cc);
> > +               bpf_log(log, "arg#%d reference type('%s %s') candidate matching error: %d\n",
> > +                       arg_idx, btf_type_str(t), __btf_name_by_offset(btf, t->name_off),
> > +                       err);
> > +               goto cand_cache_unlock;
> > +       }
> > +       if (cc->cnt != 1) {
> > +               bpf_log(log, "arg#%d reference type('%s %s') %s\n",
> > +                       arg_idx, btf_type_str(t), __btf_name_by_offset(btf, t->name_off),
> > +                       cc->cnt == 0 ? "has no matches" : "is ambiguous");
> > +               err = cc->cnt == 0 ? -ENOENT : -ESRCH;
> > +               goto cand_cache_unlock;
> > +       }
> > +       if (strcmp(cc->cands[0].btf->name, "vmlinux") != 0) {
>
> use btf_is_module() ?

oh, I was trying to find it and couldn't. Great, will use btf_is_module().

>
> > +               bpf_log(log, "arg#%d reference type('%s %s') points to kernel module type (unsupported)\n",
> > +                       arg_idx, btf_type_str(t), __btf_name_by_offset(btf, t->name_off));
> > +               err = -EOPNOTSUPP;
> > +               goto cand_cache_unlock;
> > +       }
> > +       kern_type_id = cc->cands[0].id;
> > +
> > +cand_cache_unlock:
> > +       mutex_unlock(&cand_cache_mutex);
> > +       if (err)
> > +               return err;
> > +
> > +       return kern_type_id;
> > +}
> > +
> >  enum btf_arg_tag {
> >         ARG_TAG_CTX = 0x1,
> >         ARG_TAG_NONNULL = 0x2,
> > +       ARG_TAG_TRUSTED = 0x4,
> > +       ARG_TAG_UNTRUSTED = 0x8,
> >  };
> >
> >  /* Process BTF of a function to produce high-level expectation of function
> > @@ -6906,6 +6975,10 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> >
> >                         if (strcmp(tag, "ctx") == 0) {
> >                                 tags |= ARG_TAG_CTX;
> > +                       } else if (strcmp(tag, "trusted") == 0) {
> > +                               tags |= ARG_TAG_TRUSTED;
> > +                       } else if (strcmp(tag, "untrusted") == 0) {
> > +                               tags |= ARG_TAG_UNTRUSTED;
> >                         } else if (strcmp(tag, "nonnull") == 0) {
> >                                 tags |= ARG_TAG_NONNULL;
> >                         } else {
> > @@ -6940,6 +7013,23 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> >                         sub->args[i].arg_type = ARG_PTR_TO_DYNPTR | MEM_RDONLY;
> >                         continue;
> >                 }
> > +               if (tags & (ARG_TAG_TRUSTED | ARG_TAG_UNTRUSTED)) {
> > +                       int kern_type_id;
> > +
> > +                       kern_type_id = btf_get_ptr_to_btf_id(log, i, btf, t);
> > +                       if (kern_type_id < 0)
> > +                               return kern_type_id;
> > +
> > +                       sub->args[i].arg_type = ARG_PTR_TO_BTF_ID | PTR_UNTRUSTED | PTR_MAYBE_NULL;
>
> PTR_MAYBE_NULL doesn't make sense for untrusted.
> It may be zero or -1 or 0xffff the bpf prog may or may not
> compare that with NULL or any other value.
> It doesn't affect safety and the verifier shouldn't be tracking
> null-ness of untrusted pointers. Just to avoid extra code and run-time.

The idea was to allow the caller to pass explicit NULL for such an
argument to say "no parameter", basically. I understand that at
runtime you can have non-zero actual value and exception handling code
will handle that.

>
> But more importantly...
> ARG_PTR_TO_BTF_ID with PTR_UNTRUSTED looks scary to me.
> base_type() trims flags and in many places we check
> base_type(arg) == ARG_PTR_TO_BTF_ID
> the rhs can come from helper defs in .arg1_type too.
> A lot of code need to be carefully audited to make sure
> we don't accidently introduce a safety issue because
> PTR_TO_BTF_ID | PTR_UNTRUSTED was added to btf_ptr_types[].
> The handling of it in check_reg_type() looks ok though...
>
> > @@ -8262,6 +8263,12 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> >         case PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED:
> >                 /* Handled by helper specific checks */
> >                 break;
> > +       case PTR_TO_BTF_ID | PTR_UNTRUSTED:
> > +               if (!(arg_type & PTR_UNTRUSTED)) {
> > +                       verbose(env, "Passing unexpected untrusted pointer as arg#%d\n", regno);
> > +                       return -EACCES;
> > +               }
>
> both type and arg_type are untrusted, but I need to spend
> a ton more time thinking it through.
> Maybe avoid adding __arg_untrusted for now?
> The patch will be easier to understand. At least for me.
>

Ok, no problem. I added __arg_untrusted for completeness and I can see
people wanting to pass PTR_TO_BTF_ID they got from tracepoint or
whatnot. But __arg_trusted is more critical and can't be worked
around, while you can get effectively __arg_untrusted with passing
uintptr_t argument and then doing bpf_rdonly_cast(). Will drop
__arg_untrusted for now.