On 04/08/2023 17:11, Nick Desaulniers wrote:
> + Marcus (who also just reported seeing this
> https://github.com/ClangBuiltLinux/linux/issues/1825#issuecomment-1664671027
> and might be able to help reproduce).
> + Fangrui (because seeing dd used as a result of 90ceddcb4950 makes me shudder)
>
> On Thu, Aug 3, 2023 at 3:10 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>
>> On 03/08/2023 21:50, Nick Desaulniers wrote:
>>> On Thu, Aug 3, 2023 at 1:39 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>>>>
>>>> On Thu, Aug 03, 2023 at 11:02:46AM -0700, Nick Desaulniers wrote:
>>>>> Hi Martin (and BTF/BPF team),
>>>>> I've observed 2 user reports with the error from the subject of this email.
>>>>> https://github.com/ClangBuiltLinux/linux/issues/1825
>>>>> https://bbs.archlinux.org/viewtopic.php?id=284177
>>>>>
>>>>> Any chance you could take a look at these reports and help us figure
>>>>> out what's going wrong here? Nathan and I haven't been able to
>>>>> reproduce, but this seems to be affecting OpenMandriva (and Tomasz).
>>>>>
>>>>> Sounds like perhaps llvm-objcopy vs gnu objcopy might be a relevant detail?
>>>>
>>>> Masami had a problem with new versions of compilers that was solved
>>>> with:
>>>>
>>>> ------------------------ 8< --------------------------------------------
>>>>> To check that please tweak:
>>>>>
>>>>> ⬢[acme@toolbox perf-tools-next]$ grep DWARF ../build/v6.2-rc5+/.config
>>>>> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
>>>>> # CONFIG_DEBUG_INFO_DWARF4 is not set
>>>>> # CONFIG_DEBUG_INFO_DWARF5 is not set
>>>>> ⬢[acme@toolbox perf-tools-next]$
>>>>>
>>>>> i.e. disable CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT and enable
>>>>> CONFIG_DEBUG_INFO_DWARF4.
>>>>
>>>> Hm, with CONFIG_DEBUG_INFO_DWARF4, no warning were shown.
>>>
>>> Downgrading from the now-6-year-old DWARFv5 to now-13-year-old DWARFv4
>>> is not what I'd consider a fix. Someday we can move to
>>> DWARFv5...someday...
>>>
>>> What you describe sounds like build success, but reduction in debug info.
>>>
>>> The reports I'm referring to seem to result in a build failure.
>>>
>>
>> This is a strange one. The error in question
>>
>>   CC      .vmlinux.export.o
>>   UPD     include/generated/utsversion.h
>>   CC      init/version-timestamp.o
>>   LD      .tmp_vmlinux.btf
>>   BTF     .btf.vmlinux.bin.o
>> libbpf: BTF header not found
>> pahole: .tmp_vmlinux.btf: Invalid argument
>
> That's slightly different from Tomasz and Marcus' report (not sure if
> that's relevant):
>
> FAILED: load BTF from vmlinux: Invalid argument
>
> That seems to come from
> tools/bpf/resolve_btfids/main.c:529
> Which seems like some failed call to btf_parse().
> EINVAL is getting propagated up from btf_parse(), but that's not super
> descriptive...
>

Okay, that makes more sense. Basically the stage where we read
vmlinux BTF to do BTF id resolution (BTFIDS) is finding an empty
BTF section.

> The hard part is that I suspect OpenMandriva (Tomasz) and Marcus are
> both setting additional flags in their toolchains, which can make
> reproducing tricky.
>

I tried falling back to the config referenced in the earlier bug report

https://github.com/ClangBuiltLinux/linux/files/10050200/config_bpf.txt

...but still couldn't reproduce it with LLVM 17 + pahole v1.24. That
config did specify DWARF5; if we can reproduce this, it would probably
be good to vary between forcing DWARF4 and DWARF5 to see if that is a
contributing factor as Arnaldo suggested.

Alan

>>
>> ...occurs during BTF parsing when the raw size of the BTF is smaller
>> than the BTF header size, which should never happen unless BTF
>> is corrupted. Thing is, at that stage we shouldn't be parsing BTF,
>> we should be generating it from DWARF. The only time pahole parses BTF
>> is when it's creating split BTF for modules (it parses the base BTF), or
>> when it's reading existing BTF, neither of which it should be doing at
>> this stage.
>>
>> But I suspect the issue is in gen_btf() in scripts/link-vmlinux.sh.
>> Prior to running pahole, we call "vmlinux_link .tmp_vmlinux.btf".
>> If that went awry somehow and .tmp_vmlinux.btf wasn't created, it
>
> Wouldn't we expect some kind of linker error though in that case?
>
>> would explain the "Invalid argument" error:
>>
>> $ pahole -J nosuchfile
>> pahole: nosuchfile: Invalid argument
>>
>> I see some clang specifics in vmlinux_link(), so I think a good
>> first step would be to check if .tmp_vmlinux.btf exists prior
>> to running pahole. The submitter mentioned swapping linkers seems to
>> help, so that seems a promising angle. If there's a kernel .config
>> available I can try and reproduce the failure too. Thanks!
>>
>> Alan
>>
>>>>
>>>>   LD      .tmp_vmlinux.btf
>>>>   BTF     .btf.vmlinux.bin.o
>>>>   LD      .tmp_vmlinux.kallsyms1
>>>>
>>>> And
>>>>
>>>> / # strings /sys/kernel/btf/vmlinux | wc -l
>>>> 89921
>>>> / # strings /sys/kernel/btf/vmlinux | grep -w kfree
>>>> kfree
>>>>
>>>> It seems the BTF is correctly generated. (with DWARF5, the number of symbols
>>>> are about 30000.)
>>>
>>>
>>>
>
>
>
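
For anyone trying to reproduce this, the "BTF header not found" condition
described in the thread (raw BTF data smaller than the BTF header) can be
illustrated with a rough sketch. This is not libbpf's code. It assumes the
24-byte struct btf_header layout from include/uapi/linux/btf.h and
little-endian data; the function name check_btf_header() and every message
other than "BTF header not found" are made up for illustration.

```python
import struct

# Assumed layout of struct btf_header (include/uapi/linux/btf.h):
#   __u16 magic; __u8 version; __u8 flags; __u32 hdr_len;
#   __u32 type_off; __u32 type_len; __u32 str_off; __u32 str_len;
# Little-endian is assumed here; a big-endian target would need ">".
BTF_HEADER = struct.Struct("<HBBIIIII")
BTF_MAGIC = 0xEB9F


def check_btf_header(raw: bytes) -> str:
    """Hypothetical helper: classify raw .BTF section bytes."""
    # An empty or truncated section cannot even hold the header, which is
    # the situation the "BTF header not found" message points at.
    if len(raw) < BTF_HEADER.size:
        return "BTF header not found"

    magic, _version, _flags, hdr_len, *_rest = BTF_HEADER.unpack_from(raw)

    # The messages below are illustrative, not real libbpf output.
    if magic != BTF_MAGIC:
        return "invalid BTF magic"
    if hdr_len > len(raw):
        return "BTF header length exceeds data"
    return "ok"
```

Feeding it the bytes of the .BTF section from .tmp_vmlinux.btf (however you
extract them) would show whether the section is empty or truncated before
pahole or resolve_btfids ever looks at it.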