On Tue, 2022-11-01 at 17:08 +0000, Alan Maguire wrote: > On 31/10/2022 15:49, Eduard Zingerman wrote: > > On Thu, 2022-10-27 at 15:07 -0700, Andrii Nakryiko wrote: > > > On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > [...] > > > > + > > > > +/* > > > > + * Collect a `name_off_map` that maps type names to type ids for all > > > > + * canonical structs and unions. If the same name is shared by several > > > > + * canonical types use a special value 0 to indicate this fact. > > > > + */ > > > > +static int btf_dedup_fill_unique_names_map(struct btf_dedup *d, struct hashmap *names_map) > > > > +{ > > > > + int i, err = 0; > > > > + __u32 type_id, collision_id; > > > > + __u16 kind; > > > > + struct btf_type *t; > > > > + > > > > + for (i = 0; i < d->btf->nr_types; i++) { > > > > + type_id = d->btf->start_id + i; > > > > + t = btf_type_by_id(d->btf, type_id); > > > > + kind = btf_kind(t); > > > > + > > > > + if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION) > > > > + continue; > > > > > > let's also do ENUM FWD resolution. ENUM FWD is just ENUM with vlen=0 > > > > Interestingly this is necessary only for mixed enum / enum64 case. > > Forward enum declarations are resolved by bpf/btf.c:btf_dedup_prim_type: > > > > Ah, great catch! A forward can look like an enum to one CU but another CU can > specify values that make it an enum64. > > > case BTF_KIND_ENUM: > > h = btf_hash_enum(t); > > for_each_dedup_cand(d, hash_entry, h) { > > cand_id = (__u32)(long)hash_entry->value; > > cand = btf_type_by_id(d->btf, cand_id); > > if (btf_equal_enum(t, cand)) { > > new_id = cand_id; > > break; > > } > > if (btf_compat_enum(t, cand)) { > > if (btf_is_enum_fwd(t)) { > > /* resolve fwd to full enum */ > > new_id = cand_id; > > break; > > } > > /* resolve canonical enum fwd to full enum */ > > d->map[cand_id] = type_id; > > } > > } > > break; > > // ... similar logic for ENUM64 ... > > > > - btf_hash_enum ignores vlen when hashing; > > - btf_compat_enum compares only names and sizes. > > > > So, if forward and main declaration kinds match (either BTF_KIND_ENUM > > or BTF_KIND_ENUM64) the forward declaration would be removed. But if > > the kinds are different the forward declaration would remain. E.g.: > > > > CU #1: > > enum foo; > > enum foo *a; > > > > CU #2: > > enum foo { x = 0xfffffffff }; > > enum foo *b; > > > > BTF: > > [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1 > > 'x' val=68719476735ULL > > [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) > > [3] PTR '(anon)' type_id=1 > > [4] ENUM 'foo' encoding=UNSIGNED size=4 vlen=0 > > [5] PTR '(anon)' type_id=4 > > > > BTF_KIND_FWDs are unified during btf_dedup_struct_types but enum > > forward declarations are not. So it would be incorrect to add enum > > forward declaration unification logic to btf_dedup_resolve_fwds, > > because the following case would not be covered: > > > > CU #1: > > enum foo; > > struct s { enum foo *a; } *a; > > > > CU #2: > > enum foo { x = 0xfffffffff }; > > struct s { enum foo *a; } *b; > > > > Currently STRUCTs 's' are not de-duplicated. > > > > What if CU#1 is in base BTF and CU#2 in split module BTF? I think we'd explicitly > want to avoid deduping "struct s" then since we can't be sure that it is the > same enum they are pointing at. That's the logic we employ for structs at > least, based upon the rationale that we can't feed back knowledge of types > from module to kernel BTF since the latter is now fixed (Andrii, do correct me > if I have this wrong). In such a case the enum is no longer standalone; it > serves the purpose of allowing us to define a pointer to a module-specific > type. We recently found some examples of this sort of thing with structs, > where the struct was defined in module BTF, making dedup fail for some core > kernel data types, but the problem was restricted to modules which _did_ > define the type so wasn't a major driver of dedup failures. Not sure if > there's many (any?) enum cases of this in practice. Hi Alan, As far as I understand the loop in `btf_dedup_prim_types` guarantees that only ids from the split module would be remapped: struct btf { ... /* BTF type ID of the first type in this BTF instance: * - for base BTF it's equal to 1; * - for split BTF it's equal to biggest type ID of base BTF plus 1. */ int start_id; ... } ... for (i = 0; i < d->btf->nr_types; i++) { err = btf_dedup_prim_type(d, d->btf->start_id + i); if (err) return err; } Thus CU1:foo won't be updated to be CU2:foo and CU1:s will not be the same as CU2:s. Is that right or am I confused? Thanks, Eduard > > I suppose if we could guarantee the dedup happened within the same object > (kernel or module) we could relax this constraint though? > > Alan