Re: [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> Map distilled base BTF type ids referenced in split BTF and their
> references to the base BTF passed in, and if the mapping succeeds,
> reparent the split BTF to the base BTF.
>
> Relocation rules are
>
> - base types must match exactly
> - enum[64] types should match all value name/value pairs, but the
>   to-be-relocated enum[64] can also define additional name/value pairs
> - an enum64 can match an enum and vice versa provided the values match
>   as described above
> - named fwds match to the correspondingly-named struct/union/enum/enum64
> - structs with no members match to the correspondingly-named struct/union
>   provided their sizes match
> - anon struct/unions must have field names/offsets specified in base
>   reference BTF matched by those in base BTF we are matching with
>
> Relocation can not recurse, since it will be used in-kernel also and
> we do not want to blow up the kernel stack when carrying out type
> compatibility checks.  Hence we use a stack for reference type
> relocation rather then recursive function calls.  The approach however
> is the same; we use a depth-first search to match the referents
> associated with reference types, and work back from there to match
> the reference type itself.
>
> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
> ---
>  tools/lib/bpf/Build             |   2 +-
>  tools/lib/bpf/btf.c             |  58 +++
>  tools/lib/bpf/btf.h             |   8 +
>  tools/lib/bpf/btf_relocate.c    | 601 ++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.map        |   1 +
>  tools/lib/bpf/libbpf_internal.h |   2 +
>  6 files changed, 671 insertions(+), 1 deletion(-)
>  create mode 100644 tools/lib/bpf/btf_relocate.c
>
> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> index b6619199a706..336da6844d42 100644
> --- a/tools/lib/bpf/Build
> +++ b/tools/lib/bpf/Build
> @@ -1,4 +1,4 @@
>  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
>             netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
>             btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
> -           usdt.o zip.o elf.o features.o
> +           usdt.o zip.o elf.o features.o btf_relocate.o
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 9036c1dc45d0..f00a84fea9b5 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -5541,3 +5541,61 @@ int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
>         errno = -ret;
>         return ret;
>  }
> +
> +struct btf_rewrite_strs {
> +       struct btf *btf;
> +       const struct btf *old_base_btf;
> +       int str_start;
> +       int str_diff;
> +};
> +
> +static int btf_rewrite_strs(__u32 *str_off, void *ctx)
> +{
> +       struct btf_rewrite_strs *r = ctx;
> +       const char *s;
> +       int off;
> +
> +       if (!*str_off)
> +               return 0;
> +       if (*str_off >= r->str_start) {
> +               *str_off += r->str_diff;
> +       } else {
> +               s = btf__str_by_offset(r->old_base_btf, *str_off);
> +               if (!s)
> +                       return -ENOENT;
> +               off = btf__add_str(r->btf, s);
> +               if (off < 0)
> +                       return off;
> +               *str_off = off;
> +       }
> +       return 0;
> +}
> +
> +int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
> +{
> +       struct btf_rewrite_strs r = {};
> +       struct btf_type *t;
> +       int i, err;
> +
> +       r.old_base_btf = btf__base_btf(btf);
> +       if (!r.old_base_btf)
> +               return -EINVAL;
> +       r.btf = btf;
> +       r.str_start = r.old_base_btf->hdr->str_len;
> +       r.str_diff = base_btf->hdr->str_len - r.old_base_btf->hdr->str_len;
> +       btf->base_btf = base_btf;
> +       btf->start_id = btf__type_cnt(base_btf);
> +       btf->start_str_off = base_btf->hdr->str_len;
> +       for (i = 0; i < btf->nr_types; i++) {
> +               t = (struct btf_type *)btf__type_by_id(btf, i + btf->start_id);

btf_type_by_id()

> +               err = btf_type_visit_str_offs(t, btf_rewrite_strs, &r);
> +               if (err)
> +                       break;
> +       }
> +       return err;
> +}
> +
> +int btf__relocate(struct btf *btf, const struct btf *base_btf)
> +{
> +       return btf_relocate(btf, base_btf, NULL);
> +}

[...]

> +               /* either names must match or both be anon. */
> +               if (t->name_off && nt->name_off) {
> +                       if (strcmp(btf__name_by_offset(r->btf, t->name_off),
> +                                  btf__name_by_offset(r->base_btf, nt->name_off)))
> +                               continue;
> +               } else if (t->name_off != nt->name_off) {
> +                       continue;
> +               }

btf__name_by_offset(0) return "", so you don't need this if/else
guard, just do strcmp()

> +               *tp = nt;
> +               *id = i;
> +               return 0;
> +       }
> +       return -ENOENT;
> +}
> +
> +static int btf_relocate_int(struct btf_relocate *r, const char *name,
> +                            const struct btf_type *t, const struct btf_type *bt)
> +{
> +       __u8 encoding, bencoding, bits, bbits;
> +
> +       if (t->size != bt->size) {
> +               pr_warn("INT types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, t->size, bt->size);
> +               return -EINVAL;
> +       }
> +       encoding = btf_int_encoding(t);
> +       bencoding = btf_int_encoding(bt);
> +       if (encoding != bencoding) {
> +               pr_warn("INT types '%s' disagree on encoding; distilled base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
> +                       name,
> +                       encoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> +                       encoding & BTF_INT_CHAR ? "char" : "nonchar",
> +                       encoding & BTF_INT_BOOL ? "bool" : "nonbool",
> +                       bencoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> +                       bencoding & BTF_INT_CHAR ? "char" : "nonchar",
> +                       bencoding & BTF_INT_BOOL ? "bool" : "nonbool");
> +               return -EINVAL;
> +       }
> +       bits = btf_int_bits(t);
> +       bbits = btf_int_bits(bt);
> +       if (bits != bbits) {

nit: this b* prefix is a bit ugly, maybe use enc/base_enc and bits/base_bits?

> +               pr_warn("INT types '%s' disagree on bit size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, bits, bbits);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +static int btf_relocate_float(struct btf_relocate *r, const char *name,
> +                              const struct btf_type *t, const struct btf_type *bt)
> +{
> +
> +       if (t->size != bt->size) {
> +               pr_warn("float types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, t->size, bt->size);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +/* ensure each enum[64] value in type t has equivalent in base BTF and that
> + * values match; we must support matching enum64 to enum and vice versa
> + * as well as enum to enum and enum64 to enum64.
> + */
> +static int btf_relocate_enum(struct btf_relocate *r, const char *name,
> +                             const struct btf_type *t, const struct btf_type *bt)
> +{
> +       struct btf_enum *v = btf_enum(t);
> +       struct btf_enum *bv = btf_enum(bt);
> +       struct btf_enum64 *v64 = btf_enum64(t);
> +       struct btf_enum64 *bv64 = btf_enum64(bt);
> +       bool found, match, bisenum, isenum;

is_enum? bisenum is a bit too much to read without underscores (and
I'd still use base_ prefix)

> +       const char *vname, *bvname;
> +       __u32 name_off, bname_off;
> +       __u64 val = 0, bval = 0;
> +       int i, j;
> +

[...]

> +               if (!match) {
> +                       if (t->name_off)
> +                               pr_warn("ENUM[64] types '%s' disagree on enum value '%s'; distilled base BTF specifies value %lld; base BTF specifies value %lld\n",
> +                                       name, vname, val, bval);
> +                       return -EINVAL;
> +               }

What's the motivation to check enum values if we don't really do any
check like this for struct/union? It feels like just checking that
enum names and byte sizes match would be adequate, no?

I have similar feelings about INT checks, we assume the kernel module
was built against valid base BTF in the first place, so as long as
general memory layout matches, it should be OK to relocate. So I'd
stick to NAME + size checks.

If the kernel module was built with an enum definition that's not
compatible with the base kernel, it's a bigger problem than BTF. Just
like what we discussed with STRUCT/UNION.

> +       }
> +       return 0;
> +}
> +
> +/* relocate base types (int, float, enum, enum64 and fwd) */
> +static int btf_relocate_base_type(struct btf_relocate *r, __u32 id)
> +{
> +       const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
> +       const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
> +       const struct btf_type *bt = NULL;
> +       __u32 base_id = 0;
> +       int err = 0;
> +
> +       switch (btf_kind(t)) {
> +       case BTF_KIND_INT:
> +       case BTF_KIND_ENUM:
> +       case BTF_KIND_FLOAT:
> +       case BTF_KIND_ENUM64:
> +       case BTF_KIND_FWD:
> +               break;
> +       default:
> +               return 0;

why this is not an error?

> +       }
> +
> +       if (r->map[id] <= BTF_MAX_NR_TYPES)
> +               return 0;
> +
> +       while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
> +               bt = btf_type_by_id(r->base_btf, base_id);
> +               switch (btf_kind(t)) {
> +               case BTF_KIND_INT:
> +                       err = btf_relocate_int(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_ENUM:
> +               case BTF_KIND_ENUM64:
> +                       err = btf_relocate_enum(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_FLOAT:
> +                       err = btf_relocate_float(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_FWD:
> +                       err = 0;
> +                       break;
> +               default:
> +                       return 0;
> +               }
> +               if (!err) {
> +                       r->map[id] = base_id;
> +                       return 0;
> +               }
> +       }

I'm apprehensive of this linear search (many times) over vmlinux BTF,
it feels slow and sloppy, tbh

What if we mandate that distilled base BTF should be sorted by (kind,
name) by pahole/libbpf (which is simple enough to do), and then we can
do a single linear pass over vmlinux BTF + quick binary search over
distilled base BTF, marking (on the side) which base distilled BTF
type was processed. Then keep a pointer of processed distilled base
BTF types, and if at the end it doesn't match base distilled BTF
number of types, we couldn't relocate some of base types.

Simple and fast, WDYT? Or if we don't want to make pahole/libbpf sort,
we can build *distilled base* index cheaply in memory, and do
effectively the same (that's perhaps a bit more robust, but I think we
can just say that distilled base has to be sorted).

For STRUCT/UNION we'd need to do two searches, once for FWD+name and
if not found (embedded struct/union case) STRUCT/UNION+name, but
that's still fast with two binary searches.

A lot of the code below would go away (once we keep only named types
in distilled base), so I didn't spend much time reviewing it, sorry.

> +       return err;
> +}
> +

[...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux