On Sat, Apr 03, 2021 at 02:08:01PM -0300, Arnaldo Carvalho de Melo wrote: > Em Fri, Apr 02, 2021 at 12:41:47PM -0700, Yonghong Song escreveu: > > > > > > On 4/2/21 11:08 AM, Arnaldo wrote: > > > > > > > > > On April 2, 2021 2:42:21 PM GMT-03:00, Yonghong Song <yhs@xxxxxx> wrote: > > > > On 4/2/21 10:23 AM, Yonghong Song wrote: > > > :> Thanks. I checked out the branch and did some testing with latest > > > > clang > > > > > trunk (just pulled in). > > > > > > > > > > With kernel LTO note support, I tested gcc non-lto, and llvm-lto > > > > mode, > > > > > it works fine. > > > > > > > > > > Without kernel LTO note support, I tested > > > > > gcc non-lto <=== ok > > > > > llvm non-lto <=== not ok > > > > > llvm lto <=== ok > > > > > > > > > > Surprisingly llvm non-lto vmlinux had the same "tcp_slow_start" > > > > issue. > > > > > Some previous version of clang does not have this issue. > > > > > I double checked the dwarfdump and it is indeed has the same reason > > > > > for lto vmlinux. I checked abbrev section and there is no cross-cu > > > > > references. > > > > > > > > > > That means we need to adapt this patch > > > > > dwarf_loader: Handle subprogram ret type with abstract_origin > > > > properly > > > > > for non merging case as well. > > > > > The previous patch fixed lto subprogram abstract_origin issue, > > > > > I will submit a followup patch for this. > > > > > > > > Actually, the change is pretty simple, > > > > > > > > diff --git a/dwarf_loader.c b/dwarf_loader.c > > > > index 5dea837..82d7131 100644 > > > > --- a/dwarf_loader.c > > > > +++ b/dwarf_loader.c > > > > @@ -2323,7 +2323,11 @@ static int die__process_and_recode(Dwarf_Die > > > > *die, struct cu *cu) > > > > int ret = die__process(die, cu); > > > > if (ret != 0) > > > > return ret; > > > > - return cu__recode_dwarf_types(cu); > > > > + ret = cu__recode_dwarf_types(cu); > > > > + if (ret != 0) > > > > + return ret; > > > > + > > > > + return cu__resolve_func_ret_types(cu); > > > > } > > > > > > > > Arnaldo, do you just want to fold into previous patches, or > > > > you want me to submit a new one? > > > > > > I can take care of that. > > > > > > And I think it's time for to look at Jiri's test suite... :-) heya, the test did not get change in generated BTF data with the change below, so looks good jirka > > > > > > It's a holiday here, so I'll take some time to get to this, hopefully I'll tag 1.21 tomorrow tho. > > > > Thanks for taking care of this! Right, 1.21 looks very close. > > So our HEAD now is this: > > > commit c03cd5c6ae0cec97d23edcf0a343bbff3df3c96a > Author: Yonghong Song <yhs@xxxxxx> > Date: Thu Apr 1 16:55:34 2021 -0700 > > dwarf_loader: Handle subprogram ret type with abstract_origin properly > > With latest bpf-next built with clang LTO (thin or full), I hit one test > failures: > > $ ./test_progs -t tcp > ... > libbpf: extern (func ksym) 'tcp_slow_start': func_proto [23] incompatible with kernel [115303] > libbpf: failed to load object 'bpf_cubic' > libbpf: failed to load BPF skeleton 'bpf_cubic': -22 > test_cubic:FAIL:bpf_cubic__open_and_load failed > #9/2 cubic:FAIL > ... > > The reason of the failure is due to bpf program 'tcp_slow_start' func > signature is different from vmlinux BTF. bpf program uses the following > signature: > > extern __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked); > > which is identical to the kernel definition in linux:include/net/tcp.h: > > u32 tcp_slow_start(struct tcp_sock *tp, u32 acked); > > While vmlinux BTF definition like: > > [115303] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2 > 'tp' type_id=39373 > 'acked' type_id=18 > [115304] FUNC 'tcp_slow_start' type_id=115303 linkage=static > > The above is dumped with `bpftool btf dump file vmlinux`. > > You can see the ret_type_id is 0 and this caused the problem. > > Looking at dwarf, we have: > > 0x11f2ec67: DW_TAG_subprogram > DW_AT_low_pc (0xffffffff81ed2330) > DW_AT_high_pc (0xffffffff81ed235c) > DW_AT_frame_base () > DW_AT_GNU_all_call_sites (true) > DW_AT_abstract_origin (0x11f2ed66 "tcp_slow_start") > ... > 0x11f2ed66: DW_TAG_subprogram > DW_AT_name ("tcp_slow_start") > DW_AT_decl_file ("/home/yhs/work/bpf-next/net/ipv4/tcp_cong.c") > DW_AT_decl_line (392) > DW_AT_prototyped (true) > DW_AT_type (0x11f130c2 "u32") > DW_AT_external (true) > DW_AT_inline (DW_INL_inlined) > > We have a subprogram which has an abstract_origin pointing to the > subprogram prototype with return type. Current one pass recoding cannot > easily resolve this easily since at the time recoding for 0x11f2ec67, > the return type in 0x11f2ed66 has not been resolved. > > To simplify implementation, I just added another pass to go through all > functions after recoding pass. This should resolve the above issue. > > With this patch, among total 250999 functions in vmlinux, 4821 functions > needs return type adjustment from type id 0 to correct values. The above > failed bpf selftest passed too. > > Committer testing: > > Before: > > $ pfunct tcp_slow_start > void tcp_slow_start(struct tcp_sock * tp, u32 acked); > $ > $ pfunct --prototypes /sys/kernel/btf/vmlinux > before > $ head before > int fb_is_primary_device(struct fb_info * info); > int arch_resume_nosmt(void); > int relocate_restore_code(void); > int arch_hibernation_header_restore(void * addr); > int get_e820_md5(struct e820_table * table, void * buf); > int arch_hibernation_header_save(void * addr, unsigned int max_size); > int pfn_is_nosave(long unsigned int pfn); > int swsusp_arch_resume(void); > int amd_bus_cpu_online(unsigned int cpu); > void pci_enable_pci_io_ecs(void); > $ > > After: > > $ pfunct -F btf ../build/bpf_clang_thin_lto/vmlinux -f tcp_slow_start > u32 tcp_slow_start(struct tcp_sock * tp, u32 acked); > $ > $ pfunct -F btf --prototypes ../build/bpf_clang_thin_lto/vmlinux > after > $ > $ head after > int fb_is_primary_device(struct fb_info * info); > int arch_resume_nosmt(void); > int relocate_restore_code(void); > int arch_hibernation_header_restore(void * addr); > int get_e820_md5(struct e820_table * table, void * buf); > int arch_hibernation_header_save(void * addr, unsigned int max_size); > int pfn_is_nosave(long unsigned int pfn); > int swsusp_arch_resume(void); > int amd_bus_cpu_online(unsigned int cpu); > void pci_enable_pci_io_ecs(void); > $ > $ diff -u before after | grep ^+ | wc -l > 1604 > $ > > $ diff -u before after | grep tcp_slow_start > -void tcp_slow_start(struct tcp_sock * tp, u32 acked); > +u32 tcp_slow_start(struct tcp_sock * tp, u32 acked); > $ > $ diff -u before after | grep ^[+-] | head > --- before 2021-04-02 11:35:15.578160795 -0300 > +++ after 2021-04-02 11:33:34.204847317 -0300 > -void set_bf_sort(const struct dmi_system_id * d); > +int set_bf_sort(const struct dmi_system_id * d); > -void raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 val); > -void raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 * val); > +int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 val); > +int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 * val); > -void xen_find_device_domain_owner(struct pci_dev * dev); > +int xen_find_device_domain_owner(struct pci_dev * dev); > $ > > The same results are obtained if using /sys/kernel/btf/vmlinux after > rebooting with the kernel built from the ../build/bpf_clang_thin_lto/vmlinux > file used in the above 'after' examples. > > Signed-off-by: Yonghong Song <yhs@xxxxxx> > Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> > Acked-by: David Blaikie <dblaikie@xxxxxxxxx> > Cc: Alexei Starovoitov <ast@xxxxxxxxxx> > Cc: Bill Wendling <morbo@xxxxxxxxxx> > Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx> > Cc: bpf@xxxxxxxxxxxxxxx > Cc: dwarves@xxxxxxxxxxxxxxx > Cc: kernel-team@xxxxxx > Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> > > diff --git a/dwarf_loader.c b/dwarf_loader.c > index 026d13789ff912be..5dea8378d7d2a30b 100644 > --- a/dwarf_loader.c > +++ b/dwarf_loader.c > @@ -2198,6 +2198,34 @@ out: > return 0; > } > > +static int cu__resolve_func_ret_types(struct cu *cu) > +{ > + struct ptr_table *pt = &cu->functions_table; > + uint32_t i; > + > + for (i = 0; i < pt->nr_entries; ++i) { > + struct tag *tag = pt->entries[i]; > + > + if (tag == NULL || tag->type != 0) > + continue; > + > + struct function *fn = tag__function(tag); > + if (!fn->abstract_origin) > + continue; > + > + struct dwarf_tag *dtag = tag->priv; > + struct dwarf_tag *dfunc; > + dfunc = dwarf_cu__find_tag_by_ref(cu->priv, &dtag->abstract_origin); > + if (dfunc == NULL) { > + tag__print_abstract_origin_not_found(tag); > + return -1; > + } > + > + tag->type = dfunc->tag->type; > + } > + return 0; > +} > + > static int cu__recode_dwarf_types_table(struct cu *cu, > struct ptr_table *pt, > uint32_t i) > @@ -2637,6 +2665,16 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf, > /* process merged cu */ > if (cu__recode_dwarf_types(cu) != LSK__KEEPIT) > return DWARF_CB_ABORT; > + > + /* > + * for lto build, the function return type may not be > + * resolved due to the return type of a subprogram is > + * encoded in another subprogram through abstract_origin > + * tag. Let us visit all subprograms again to resolve this. > + */ > + if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT) > + return DWARF_CB_ABORT; > + > if (finalize_cu_immediately(cus, cu, dcu, conf) > == LSK__STOP_LOADING) > return DWARF_CB_ABORT; >