On 21/10/2022 19:37, Arnaldo Carvalho de Melo wrote: > Em Fri, Oct 21, 2022 at 09:35:50AM -0700, Andrii Nakryiko escreveu: >> On Fri, Oct 21, 2022 at 9:32 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote: >>> On Fri, Oct 21, 2022 at 04:02:03PM +0100, Alan Maguire wrote: >>>> BTF deduplication was throwing some strange results, where core kernel >>>> data types were failing to deduplicate due to the return values >>>> of function type members being void (0) instead of the actual type >>>> (unsigned int). An example of this can be seen below, where >>>> "struct dst_ops" was failing to deduplicate between kernel and >>>> module: >>>> >>>> struct dst_ops { >>>> short unsigned int family; >>>> unsigned int gc_thresh; >>>> int (*gc)(struct dst_ops *); >>>> struct dst_entry * (*check)(struct dst_entry *, __u32); >>>> unsigned int (*default_advmss)(const struct dst_entry *); >>>> unsigned int (*mtu)(const struct dst_entry *); >>>> ... >>>> >>>> struct dst_ops___2 { >>>> short unsigned int family; >>>> unsigned int gc_thresh; >>>> int (*gc)(struct dst_ops___2 *); >>>> struct dst_entry___2 * (*check)(struct dst_entry___2 *, __u32); >>>> void (*default_advmss)(const struct dst_entry___2 *); >>>> void (*mtu)(const struct dst_entry___2 *); >>>> ... >>>> >>>> This was seen with >>>> >>>> bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void") >>>> >>>> ...which rewrites the return value as 0 (void) when it is marked >>>> as matching DW_TAG_unspecified_type: >>>> >>>> static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t type_id_off, uint32_t tag_type) >>>> { >>>> if (tag_type == 0) >>>> return 0; >>>> >>>> if (encoder->cu->unspecified_type.tag && tag_type == encoder->cu->unspecified_type.type) { >>>> // No provision for encoding this, turn it into void. >>>> return 0; >>>> } >>>> >>>> return type_id_off + tag_type; >>>> } >>>> >>>> However the odd thing was that on further examination, the unspecified type >>>> was not being set, so why was this logic being tripped? Futher debugging >>>> showed that the encoder->cu->unspecified_type.tag value was garbage, and >>>> the type id happened to collide with "unsigned int"; as a result we >>>> were replacing unsigned ints with void return values, and since this >>>> was being done to function type members in structs, it triggered a >>>> type mismatch which failed deduplication between kernel and module. >>>> >>>> The fix is simply to calloc() the cu in cu__new() instead. >>>> >>>> Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void") >>>> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> >>> >>> awesome, this fixes the missing dedup I was seeing >>> with current pahole: >>> >>> $ bpftool btf dump file ./vmlinux.test | grep "STRUCT 'task_struct'" | wc -l >>> 69 >>> >>> with this patch: >>> >>> $ bpftool btf dump file ./vmlinux.test | grep "STRUCT 'task_struct'" | wc -l >>> 1 >>> >> >> Nice and a great catch! I generally try to stick to calloc() in libbpf >> exactly so I don't have to worry about stuff like this. >> >> Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > >>> Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx> > > Thanks, applied, my bad, I just changed it to zalloc(): > > ⬢[acme@toolbox pahole]$ grep -A3 'zalloc(size_t' dutil.c > void *zalloc(size_t size) > { > return calloc(1, size); > } > ⬢[acme@toolbox pahole]$ > > That is used in many places, but unfortunately not on this specific case > :-\ > Thanks for the quick turnaround Arnaldo, and thanks Jiri and Andrii for taking a look! For me, this fix alone didn't sort out all the dedup issues; when testing with gcc9/11 I also saw dedup failures which resulted in core kernel data structure duplication in modules that wasn't a result of this issue (or fwd resolution issues). It turned out to be due to identical struct definitions that weren't quite handled by the existing identical struct matching logic. I'm on final approach with a libbpf patch for that that works for me, but if it works for others too it might be worth resyncing the libbf version in dwarves to pull it in once it lands, as it seems to make a big difference in dedup. Thanks! Alan > - Arnaldo > > ⬢[acme@toolbox pahole]$ grep 'zalloc(' *.c > btf_encoder.c: struct btf_encoder *encoder = zalloc(sizeof(*encoder)); > btf_loader.c: struct tag *tag = zalloc(size); > btf_loader.c: struct class_member *member = zalloc(sizeof(*member)); > btf_loader.c: struct tag *tag = zalloc(sizeof(*tag)); > ctf_loader.c: struct tag *tag = zalloc(size); > ctf_loader.c: struct class_member *member = zalloc(sizeof(*member)); > ctf_loader.c: struct class_member *member = zalloc(sizeof(*member)); > ctf_loader.c: struct tag *tag = zalloc(sizeof(*tag)); > dutil.c:void *zalloc(size_t size) > dwarf_loader.c: struct dwarf_cu *dwarf_cu = cu__zalloc(cu, sizeof(*dwarf_cu)); > dwarf_loader.c: struct dwarf_tag *dtag = cu__zalloc(dcu->cu, (sizeof(*dtag) + (spec ? sizeof(dwarf_off_ref) : 0))); > dwarf_loader.c: struct tag *tag = cu__zalloc(dcu->cu, size); > dwarf_loader.c: struct type *new_typedef = cu__zalloc(cu, sizeof(*new_typedef)); > dwarf_loader.c: recoded = cu__zalloc(cu, sizeof(*recoded)); > dwarf_loader.c: struct base_type *new_bt = cu__zalloc(cu, sizeof(*new_bt)); > dwarf_loader.c: struct type *new_enum = cu__zalloc(cu, sizeof(*new_enum)); > dwarf_loader.c: annot = zalloc(sizeof(*annot)); > dwarf_loader.c: dcu = zalloc(sizeof(*dcu)); > dwarves.c:static void *obstack_zalloc(struct obstack *obstack, size_t size) > dwarves.c:void *cu__zalloc(struct cu *cu, size_t size) > dwarves.c: return obstack_zalloc(&cu->obstack, size); > dwarves.c: return zalloc(size); > dwarves.c: struct cu *cu = zalloc(sizeof(*cu) + build_id_len); > elf_symtab.c: struct elf_symtab *symtab = zalloc(sizeof(*symtab)); > libctf.c: struct ctf *ctf = zalloc(sizeof(*ctf)); > pahole.c: struct prototype *prototype = zalloc(sizeof(*prototype) + strlen(expression) + 1); > ⬢[acme@toolbox pahole]$ >