On Fri, May 17, 2024 at 3:23 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > To support more robust split BTF, adding supplemental context for the > base BTF type ids that split BTF refers to is required. Without such > references, a simple shuffling of base BTF type ids (without any other > significant change) invalidates the split BTF. Here the attempt is made > to store additional context to make split BTF more robust. > > This context comes in the form of distilled base BTF providing minimal > information (name and - in some cases - size) for base INTs, FLOATs, > STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that > points at that base and contains any additional types needed (such as > TYPEDEF, PTR and anonymous STRUCT/UNION declarations). This > information constitutes the minimal BTF representation needed to > disambiguate or remove split BTF references to base BTF. The rules > are as follows: > > - INT, FLOAT are recorded in full. > - if a named base BTF STRUCT or UNION is referred to from split BTF, it > will be encoded either as a zero-member sized STRUCT/UNION (preserving > size for later relocation checks) or as a named FWD. Only base BTF > STRUCT/UNIONs that are either embedded in split BTF STRUCT/UNIONs or > that have multiple STRUCT/UNION instances of the same name need to > preserve size information, so a FWD representation will be used in > most cases. > - if an ENUM[64] is named, a ENUM forward representation (an ENUM > with no values) is used. > - in all other cases, the type is added to the new split BTF. > > Avoiding struct/union/enum/enum64 expansion is important to keep the > distilled base BTF representation to a minimum size. > > When successful, new representations of the distilled base BTF and new > split BTF that refers to it are returned. Both need to be freed by the > caller. > > So to take a simple example, with split BTF with a type referring > to "struct sk_buff", we will generate distilled base BTF with a > FWD struct sk_buff, and the split BTF will refer to it instead. > > Tools like pahole can utilize such split BTF to populate the .BTF > section (split BTF) and an additional .BTF.base section. Then > when the split BTF is loaded, the distilled base BTF can be used > to relocate split BTF to reference the current (and possibly changed) > base BTF. > > So for example if "struct sk_buff" was id 502 when the split BTF was > originally generated, we can use the distilled base BTF to see that > id 502 refers to a "struct sk_buff" and replace instances of id 502 > with the current (relocated) base BTF sk_buff type id. > > Distilled base BTF is small; when building a kernel with all modules > using distilled base BTF as a test, ovreall module size grew by only typo: overall > 5.3Mb total across ~2700 modules. > > Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > --- > tools/lib/bpf/btf.c | 409 ++++++++++++++++++++++++++++++++++++++- > tools/lib/bpf/btf.h | 20 ++ > tools/lib/bpf/libbpf.map | 1 + > 3 files changed, 424 insertions(+), 6 deletions(-) > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c > index 2d0840ef599a..953929d196c3 100644 > --- a/tools/lib/bpf/btf.c > +++ b/tools/lib/bpf/btf.c > @@ -1771,9 +1771,8 @@ static int btf_rewrite_str(__u32 *str_off, void *ctx) > return 0; > } > > -int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type) > +static int btf_add_type(struct btf_pipe *p, const struct btf_type *src_type) > { > - struct btf_pipe p = { .src = src_btf, .dst = btf }; > struct btf_type *t; > int sz, err; > > @@ -1782,20 +1781,27 @@ int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_t > return libbpf_err(sz); > > /* deconstruct BTF, if necessary, and invalidate raw_data */ > - if (btf_ensure_modifiable(btf)) > + if (btf_ensure_modifiable(p->dst)) > return libbpf_err(-ENOMEM); > > - t = btf_add_type_mem(btf, sz); > + t = btf_add_type_mem(p->dst, sz); > if (!t) > return libbpf_err(-ENOMEM); > > memcpy(t, src_type, sz); > > - err = btf_type_visit_str_offs(t, btf_rewrite_str, &p); > + err = btf_type_visit_str_offs(t, btf_rewrite_str, p); > if (err) > return libbpf_err(err); > > - return btf_commit_type(btf, sz); > + return btf_commit_type(p->dst, sz); > +} > + > +int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type) > +{ > + struct btf_pipe p = { .src = src_btf, .dst = btf }; > + > + return btf_add_type(&p, src_type); > } > > static int btf_rewrite_type_ids(__u32 *type_id, void *ctx) > @@ -5212,3 +5218,394 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void > > return 0; > } > + > +#define BTF_NEEDS_SIZE (1 << 31) /* flag set if either struct/union is > + * embedded - and thus size info must > + * be preserved - or if there are > + * multiple instances of the same > + * struct/union - where size can be > + * used to clarify which is wanted. > + */ please use full line comment in front of #define > +#define BTF_ID(id) (id & ~BTF_NEEDS_SIZE) > + > +struct btf_distill { > + struct btf_pipe pipe; > + int *ids; reading the rest of the code, this BTF_NEEDS_SIZE and BTF_ID() macro use was quite distracting. I'm wondering if it would be better to use a simple struct with bitfields here, e.g., struct type_state { int id: 31; bool needs_size; }; struct dist_state *states; Same memory usage, same efficiency, but more readable code. WDYT? > + unsigned int split_start_id; > + unsigned int split_start_str; > + unsigned int diff_id; > +}; > + > +/* Check if a member of a split BTF struct/union refers to a base BTF nit: comments uses "check" terminology, function name uses "find", while really it "marks" some time as embedded... Let's use consistent terminology (where mark seems like the most fitting, IMO) > + * struct/union. Members can be const/restrict/volatile/typedef > + * reference types, but if a pointer is encountered, type is no longer > + * considered embedded. > + */ > +static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx) > +{ > + struct btf_distill *dist = ctx; > + const struct btf_type *t; > + __u32 next_id = *id; > + > + do { > + if (next_id == 0) > + return 0; > + t = btf_type_by_id(dist->pipe.src, next_id); > + switch (btf_kind(t)) { > + case BTF_KIND_CONST: > + case BTF_KIND_RESTRICT: > + case BTF_KIND_VOLATILE: > + case BTF_KIND_TYPEDEF: > + case BTF_KIND_TYPE_TAG: > + next_id = t->type; > + break; > + case BTF_KIND_ARRAY: { > + struct btf_array *a = btf_array(t); > + > + next_id = a->type; > + break; > + } > + case BTF_KIND_STRUCT: > + case BTF_KIND_UNION: > + dist->ids[next_id] |= BTF_NEEDS_SIZE; > + return 0; > + default: > + return 0; > + } > + > + } while (1); nit: while (true) > + > + return 0; > +} > + > +/* Check if composite type has a duplicate-named type; if it does, retain see above about check vs mark, here at least the function name uses "mark" :) > + * size information to help guide later relocation towards the correct type. > + * For example there are duplicate 'dma_chan' structs in vmlinux BTF; > + * one is size 112, the other 16. Having size information allows relocation > + * to choose the right one. re: this dma_chan and similar cases. Is it ever a problem where a module actually needs two dma_chan in distilled base BTF? Name conflicts do happen, but intuitively I'd expect this to happen between some vmlinux-internal (so to speak, i.e., not the type used in exported functions/types) and kernel module-specific types. It would be awkward for the same module to use two different types that are named the same. Have you seen this as a problem in practice? What if for now we just error out if there are two conflicting types required in distilled BTF? > + */ > +static int btf_mark_composite_dups(struct btf_distill *dist, __u32 id) > +{ > + __u8 *cnt = calloc(dist->split_start_str, sizeof(__u8)); nit: we generally follow that initialization of variable shouldn't be doing non-trivial operations. So let's do calloc() as a separate statement outside of variable declaration. > + struct btf_type *t; > + int i; > + > + if (!cnt) > + return -ENOMEM; > + > + /* First pass; collect name counts for composite types. */ > + for (i = 1; i < dist->split_start_id; i++) { > + t = btf_type_by_id(dist->pipe.src, i); > + if (!btf_is_composite(t) || !t->name_off) > + continue; > + if (cnt[t->name_off] < 255) > + cnt[t->name_off]++; > + } > + /* Second pass; mark composite types with multiple instances of the > + * same name as needing size information. > + */ > + for (i = 1; i < dist->split_start_id; i++) { > + /* id not needed or is already preserving size information */ > + if (!dist->ids[i] || (dist->ids[i] & BTF_NEEDS_SIZE)) > + continue; > + t = btf_type_by_id(dist->pipe.src, i); > + if (!btf_is_composite(t) || !t->name_off) > + continue; > + if (cnt[t->name_off] > 1) > + dist->ids[i] |= BTF_NEEDS_SIZE; > + } > + free(cnt); > + > + return 0; > +} > + > +static bool btf_is_eligible_named_fwd(const struct btf_type *t) > +{ > + return (btf_is_composite(t) || btf_is_any_enum(t)) && t->name_off != 0; > +} > + > +static int btf_add_distilled_type_ids(__u32 *id, void *ctx) > +{ > + struct btf_distill *dist = ctx; > + struct btf_type *t = btf_type_by_id(dist->pipe.src, *id); > + int err; > + > + if (!*id) > + return 0; > + /* split BTF id, not needed */ > + if (*id >= dist->split_start_id) > + return 0; > + /* already added ? */ > + if (BTF_ID(dist->ids[*id]) > 0) > + return 0; > + > + /* only a subset of base BTF types should be referenced from split > + * BTF; ensure nothing unexpected is referenced. > + */ > + switch (btf_kind(t)) { > + case BTF_KIND_INT: > + case BTF_KIND_FLOAT: > + case BTF_KIND_FWD: > + case BTF_KIND_ARRAY: > + case BTF_KIND_STRUCT: > + case BTF_KIND_UNION: > + case BTF_KIND_TYPEDEF: > + case BTF_KIND_ENUM: > + case BTF_KIND_ENUM64: > + case BTF_KIND_PTR: > + case BTF_KIND_CONST: > + case BTF_KIND_RESTRICT: > + case BTF_KIND_VOLATILE: > + case BTF_KIND_FUNC_PROTO: > + case BTF_KIND_TYPE_TAG: > + dist->ids[*id] |= *id; > + break; > + default: > + pr_warn("unexpected reference to base type[%u] of kind [%u] when creating distilled base BTF.\n", > + *id, btf_kind(t)); > + return -EINVAL; > + } > + > + /* struct/union members not needed, except for anonymous structs > + * and unions, which we need since name won't help us determine > + * matches; so if a named struct/union, no need to recurse > + * into members. > + */ is this an outdated comment? we shouldn't have anonymous types in the distilled base, right? > + if (btf_is_eligible_named_fwd(t)) > + return 0; > + > + /* ensure references in type are added also. */ > + err = btf_type_visit_type_ids(t, btf_add_distilled_type_ids, ctx); > + if (err < 0) > + return err; > + return 0; nit: could be just `return btf_type_visit_type_ids(...);`? > +} > + > +static int btf_add_distilled_types(struct btf_distill *dist) > +{ > + bool adding_to_base = dist->pipe.dst->start_id == 1; > + int id = btf__type_cnt(dist->pipe.dst); > + struct btf_type *t; > + int i, err = 0; > + > + /* Add types for each of the required references to either distilled > + * base or split BTF, depending on type characteristics. > + */ > + for (i = 1; i < dist->split_start_id; i++) { > + const char *name; > + int kind; > + > + if (!BTF_ID(dist->ids[i])) > + continue; > + t = btf_type_by_id(dist->pipe.src, i); > + kind = btf_kind(t); > + name = btf__name_by_offset(dist->pipe.src, t->name_off); > + > + /* Named int, float, fwd struct, union, enum[64] are added to > + * base; everything else is added to split BTF. > + */ > + switch (kind) { > + case BTF_KIND_INT: > + case BTF_KIND_FLOAT: > + case BTF_KIND_FWD: > + case BTF_KIND_STRUCT: > + case BTF_KIND_UNION: > + case BTF_KIND_ENUM: > + case BTF_KIND_ENUM64: > + if ((adding_to_base && !t->name_off) || (!adding_to_base && t->name_off)) > + continue; > + break; > + default: > + if (adding_to_base) > + continue; > + break; > + } > + if (dist->ids[i] & BTF_NEEDS_SIZE) { > + /* If a named struct/union in base BTF is referenced as a type > + * in split BTF without use of a pointer - i.e. as an embedded > + * struct/union - add an empty struct/union preserving size > + * since size must be consistent when relocating split and > + * possibly changed base BTF. Similarly, when a struct/union > + * has multiple instances of the same name in the original > + * base BTF, retain size to help relocation later pick the > + * right struct/union. > + */ > + err = btf_add_composite(dist->pipe.dst, kind, name, t->size); > + } else if (btf_is_eligible_named_fwd(t)) { > + /* If not embedded, use a fwd for named struct/unions since we > + * can match via name without any other details. > + */ > + switch (kind) { > + case BTF_KIND_STRUCT: > + err = btf__add_fwd(dist->pipe.dst, name, BTF_FWD_STRUCT); > + break; > + case BTF_KIND_UNION: > + err = btf__add_fwd(dist->pipe.dst, name, BTF_FWD_UNION); > + break; > + case BTF_KIND_ENUM: > + err = btf__add_enum(dist->pipe.dst, name, t->size); > + break; > + case BTF_KIND_ENUM64: nit: combine ENUM/ENUM64 cases? > + err = btf__add_enum(dist->pipe.dst, name, t->size); > + break; > + default: > + pr_warn("unexpected kind [%u] when creating distilled base BTF.\n", > + btf_kind(t)); > + return -EINVAL; > + } > + } else { > + err = btf_add_type(&dist->pipe, t); So this should never happen if adding_to_base == true, is that right? Can we check this? Or am I missing something as usual? > + } > + if (err < 0) > + break; > + dist->ids[i] = id++; > + } > + return err; > +} > + [...] > + n = btf__type_cnt(new_split); > + /* Now update base/split BTF ids. */ > + for (i = 1; i < n; i++) { > + t = btf_type_by_id(new_split, i); > + > + err = btf_type_visit_type_ids(t, btf_update_distilled_type_ids, &dist); > + if (err < 0) > + goto err_out; > + } > + free(dist.ids); > + hashmap__free(dist.pipe.str_off_map); > + *new_base_btf = new_base; > + *new_split_btf = new_split; > + return 0; > +err_out: > + free(dist.ids); > + if (!IS_ERR(dist.pipe.str_off_map)) you don't need to check this, hashmap__free() works for IS_ERR() pointers as well > + hashmap__free(dist.pipe.str_off_map); > + btf__free(new_split); > + btf__free(new_base); > + return libbpf_err(err); > +} [...]