Re: [RFC bpf-next 09/13] libbpf: split BTF reconciliation

Alan Maguire <alan.maguire@xxxxxxxxxx> · Fri, 5 Apr 2024 11:06:34 +0100

On 29/03/2024 22:01, Andrii Nakryiko wrote:
> On Fri, Mar 22, 2024 at 3:26 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>
>> Map base reference BTF type ids referenced in split BTF and their
>> references to the base BTF passed in, and if the mapping succeeds,
>> reparent the split BTF to the base BTF.
>>
>> Reconciliation rules are
>>
>> - base types must match exactly
>> - enum[64] types should match all value name/value pairs, but the
>>   to-be-reconciled enum[64] can also define additional name/value pairs
>> - named fwds match to the correspondingly-named struct/union/enum/enum64
> 
> yeah, but what about their size if they are embedded? Using FWD was a
> nice trick, but it's not flexible enough for recording (optionally)
> size... Probably emitting an empty (but named) struct/union/enum would
> be a bit better (and actually make split BTF using base ref more valid
> even without pre-processing).
> 
>> - anon struct/unions must have field names/offsets specified in base
>>   reference BTF matched by those in base BTF we are matching with
>>
>> Reconciliation can not recurse, since it will be used in-kernel also and
>> we do not want to blow up the kernel stack when carrying out type
>> compatibility checks.  Hence we use a stack for reference type
>> reconciliation rather then recursive function calls.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
>> ---
>>  tools/lib/bpf/Build             |   2 +-
>>  tools/lib/bpf/btf.c             |  58 ++++
>>  tools/lib/bpf/btf.h             |   8 +
>>  tools/lib/bpf/btf_reconcile.c   | 590 ++++++++++++++++++++++++++++++++
> 
> how wrong would it be to call this process "relocate" rather than "reconcile"?
>

seems fine to me.

>>  tools/lib/bpf/libbpf.map        |   1 +
>>  tools/lib/bpf/libbpf_internal.h |   2 +
>>  6 files changed, 660 insertions(+), 1 deletion(-)
>>  create mode 100644 tools/lib/bpf/btf_reconcile.c
>>
> 
> [...]
> 
>> +/* Find next type after *id in base BTF that matches kind of type t passed in
>> + * and name (if it is specified).  Match fwd kinds to appropriate kind also.
>> + */
>> +static int btf_reconcile_find_next(struct btf_reconcile *r, const struct btf_type *t,
>> +                                  __u32 *id, const struct btf_type **tp)
> 
> I haven't grokked the whole patch logic just yet, doing a first pass,
> so I might be asking stupid questions, sorry.
> 
> But it looks like we have these linear searches here to find matching
> types, is that right? Wouldn't it be better to build an index first to
> speed up search?
> 

It would, but the aim here was to keep things simple with an eye to
sharing the code with the kernel. A lot of the libbpf hash stuff would
be handy but then we'd have to have something on the kernel side. Given
that the size of the base BTF is so small, the linear searches aren't
much of a cost.

>> +{
>> +       const struct btf_type *nt;
>> +       int kind, tkind = btf_kind(t);
>> +       int tkflag = btf_kflag(t);
>> +       __u32 i;
>> +
> 
> [...]
> 
>> +static int btf_reconcile_int(struct btf_reconcile *r, const char *name,
>> +                            const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +       __u32 *info = (__u32 *)(t + 1);
>> +       __u32 *binfo = (__u32 *)(bt + 1);
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("INT types '%s' disagree on size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       if (BTF_INT_ENCODING(*info) != BTF_INT_ENCODING(*binfo)) {
>> +               pr_warn("INT types '%s' disagree on encoding; reference base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
>> +                       name,
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       BTF_INT_ENCODING(*info) & BTF_INT_BOOL ? "bool" : "nonbool",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       BTF_INT_ENCODING(*binfo) & BTF_INT_BOOL ? "bool" : "nonbool");
> 
> there is btf_int_encoding() helper, please use it
>
will do, and for others below too.

>> +               return -EINVAL;
>> +       }
>> +       if (BTF_INT_BITS(*info) != BTF_INT_BITS(*binfo)) {
> 
> btf_int_bits()
> 
>> +               pr_warn("INT types '%s' disagree on bit size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, BTF_INT_BITS(*info), BTF_INT_BITS(*binfo));
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +static int btf_reconcile_float(struct btf_reconcile *r, const char *name,
>> +                              const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("float types '%s' disagree on size; reference base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* Ensure each enum value in type t has equivalent in base BTF and that values (if any) match. */
>> +static int btf_reconcile_enum(struct btf_reconcile *r, const char *name,
>> +                             const struct btf_type *t, const struct btf_type *bt)
>> +{
> 
> should we care about compatibility between ENUM and ENUM64, they can
> both represent the same values of the same size?
> 

so do you mean if one representation uses an enum, another an enum64?
Yep that might well be the case, I'll add that too.

>> +       struct btf_enum *v = (struct btf_enum *)(t + 1);
>> +       struct btf_enum *bv = (struct btf_enum *)(bt + 1);
>> +       bool found, match;
>> +       int i, j;
>> +
>> +       for (i = 0; i < btf_vlen(t); i++, v++) {
>> +               found = match = false;
>> +
>> +               if (!v->name_off)
>> +                       continue;
>> +               for (j = 0; j < btf_vlen(bt); j++, bv++) {
>> +                       if (!bv->name_off)
>> +                               continue;
>> +                       if (strcmp(btf__name_by_offset(r->base_ref_btf, v->name_off),
>> +                                  btf__name_by_offset(r->base_btf, bv->name_off)) != 0)
> 
> nit: please use local vars to shorten these long if conditions, it
> will be easier to read and follow
>

will do.

>> +                               continue;
>> +                       found = true;
>> +                       match = (v->val == bv->val);
>> +                       break;
>> +               }
> 
> [...]
> 
>> +       while ((err = btf_reconcile_find_next(r, t, &base_id, &bt)) != -ENOENT) {
>> +               bt = btf_type_by_id(r->base_btf, base_id);
>> +               switch (btf_kind(t)) {
>> +               case BTF_KIND_INT:
>> +                       err = btf_reconcile_int(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_ENUM:
>> +                       err = btf_reconcile_enum(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FLOAT:
>> +                       err = btf_reconcile_float(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_ENUM64:
>> +                       err = btf_reconcile_enum64(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FWD:
> 
> well, FWD can be for enum vs struct, gotta check that
> 
>> +                       err = 0;
>> +                       break;
>> +               default:
>> +                       return 0;
>> +               }
>> +               if (!err) {
>> +                       r->map[id] = base_id;
>> +                       return 0;
>> +               }
>> +       }
>> +       return err;
>> +}
> 
> [...]
> 
>> +                               if (err) {
>> +                                       pr_warn("could not find base BTF type for base reference type[%u]\n",
>> +                                               id);
>> +                                       return err;
>> +                               }
>> +                       } else {
>> +                               if (btf_reconcile_push(r, id) < 0 ||
>> +                                   btf_reconcile_push(r, t->type) < 0)
>> +                                       return -ENOSPC;
> 
> I'm missing something, please help me understand. I don't get why we
> need a recursive algorithm at all.
> 
> In my mind, we have this small "base ref" set of types referenced from
> module's BTF (split BTF part), right? So all we should need is to map
> every type from base ref set to vmlinux BTF.
> 
> What I don't yet fully get is why CONST/VOLATILE or PTR need to
> postpone reconciliation via a queue. By the time we get to types in
> split BTF all base ref types should be mapped, so all you need is to
> remap t->type to resolved vmlinux BTF, no?
> 

It's possible to have multiple layers of reference in the distilled base
BTF though; for example here's a case from a module's distilled base BTF:

[41] PTR '(anon)' type_id=42
[42] CONST '(anon)' type_id=0

To resolve type id 41 we need to resolve type id 42, and since type id 0
already has a mapping, at that point we can look for CONSTs that refer
to type id 0 and then once we've established that mapping we can find
PTRs that have a t->type that refers to the const. So as described below
we keep pushing type ids onto the stack until we find one with a t->type
mapping to base BTF; once we hit that we can start looking in base BTF
for types that have the mapped t->type value.

> I suspect the answer might have something to do with those anonymous
> structs/unions which you copy verbatim into base ref BTF?
>

They do add a few more types, but we can get base BTF references that
don't come from that source too. The above case PTR CONST void for
example wasn't referred to via any other distilled base types.

> But on the latter topic, I wonder if we at all need this? Why not keep
> all those anon struct/union/enum in module's part of BTF? If they are
> unnamed, I doubt they will ever be referenced from kfuncs or anything
> like that, so their BTF ID isn't that important.
> 

Changing the module BTF would make things a bit more complicated.
Currently we just update type id references and string offsets. The anon
structs we end up with tend to be very small; from the same distilled
base BTF used above here are the instances of struct/union:

[35] STRUCT '(anon)' size=4 vlen=1
        'counter' type_id=6 bits_offset=0
[62] STRUCT '(anon)' size=8 vlen=1
        'raw_lock' type_id=48 bits_offset=0
[113] UNION '(anon)' size=8 vlen=2
        'kernel' type_id=13 bits_offset=0
        'user' type_id=13 bits_offset=0
[114] STRUCT '(anon)' size=16 vlen=2
        '(anon)' type_id=113 bits_offset=0
        'is_kernel' type_id=11 bits_offset=64 bitfield_size=1
[119] STRUCT '(anon)' size=8 vlen=1
        'net' type_id=85 bits_offset=0

These only added one type that wasn't referenced elsewhere - typedef
arch_rwlock_t.

> If base BTF is all named types, it would simplify the reconciliation
> process significantly, I think.
> 
> But again, I only skimmed the overall algorithm, sorry for my
> laziness, but I figured it would be good to discuss the above first
> anyways.
>

I'll try and walk through an example of how the algorithm proceeds; that
might help make the approach concrete and we can see if it can be
simplified.

Consider split BTF that uses the base BTF type "int *", i.e. a PTR to an
"int". It could do so in an anonymous struct/union, but also as an array
member type or a FUNC_PROTO or VAR. For such a reference type, we first
encounter the outer PTR. If its t->type has a mapping to base BTF "int"
(it will), we look through the base BTF for reference types that match
the kind (here a PTR) _and_ have the required t->type (int). Once we
find the reference type in base BTF, we can add the mapping for PTR to
int from distilled->base BTF id. So the reference type with one layer of
reference is pretty straightforward.

In the case where we don't have a mapping for a t->type - let's say PTR
to PTR to int (int **) - we push the type id for PTR to PTR to int onto
the stack along with the type id for t->type (PTR to int) after. So we
will first pop PTR to int and go through the above-described type
resolution. Next we will pop the PTR to PTR to int and because we've
resolved PTR to int now, it's t->type will have a mapping and we can go
through the search process to find a PTR that refers to a PTR to int
in base BTF, and we then add that mapping too.

For an array we push the member and index types, a func proto the return
type and parameter types, etc.

So once multiple layers of reference are part of the picture I _think_
we need something like this approach.

Alan

>> +                       }
>> +                       break;
> 
> [...]