+ Marcus (who also just reported seeing this
https://github.com/ClangBuiltLinux/linux/issues/1825#issuecomment-1664671027
and might be able to help reproduce).
+ Fangrui (because seeing dd used as a result of 90ceddcb4950 makes me shudder)

On Thu, Aug 3, 2023 at 3:10 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 03/08/2023 21:50, Nick Desaulniers wrote:
> > On Thu, Aug 3, 2023 at 1:39 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> >>
> >> On Thu, Aug 03, 2023 at 11:02:46AM -0700, Nick Desaulniers wrote:
> >>> Hi Martin (and BTF/BPF team),
> >>> I've observed 2 user reports with the error from the subject of this email.
> >>> https://github.com/ClangBuiltLinux/linux/issues/1825
> >>> https://bbs.archlinux.org/viewtopic.php?id=284177
> >>>
> >>> Any chance you could take a look at these reports and help us figure
> >>> out what's going wrong here? Nathan and I haven't been able to
> >>> reproduce, but this seems to be affecting OpenMandriva (and Tomasz).
> >>>
> >>> Sounds like perhaps llvm-objcopy vs gnu objcopy might be a relevant detail?
> >>
> >> Masami had a problem with new versions of compilers that was solved
> >> with:
> >>
> >> ------------------------ 8< --------------------------------------------
> >>> To check that please tweak:
> >>>
> >>> ⬢[acme@toolbox perf-tools-next]$ grep DWARF ../build/v6.2-rc5+/.config
> >>> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> >>> # CONFIG_DEBUG_INFO_DWARF4 is not set
> >>> # CONFIG_DEBUG_INFO_DWARF5 is not set
> >>> ⬢[acme@toolbox perf-tools-next]$
> >>>
> >>> i.e. disable CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT and enable
> >>> CONFIG_DEBUG_INFO_DWARF4.
> >>
> >> Hm, with CONFIG_DEBUG_INFO_DWARF4, no warning were shown.
> >
> > Downgrading from the now-6-year-old DWARFv5 to now-13-year-old DWARFv4
> > is not what I'd consider a fix. Someday we can move to
> > DWARFv5...someday...
> >
> > What you describe sounds like build success, but reduction in debug info.
> >
> > The reports I'm referring to seem to result in a build failure.
> >
>
> This is a strange one. The error in question
>
>   CC      .vmlinux.export.o
>   UPD     include/generated/utsversion.h
>   CC      init/version-timestamp.o
>   LD      .tmp_vmlinux.btf
>   BTF     .btf.vmlinux.bin.o
> libbpf: BTF header not found
> pahole: .tmp_vmlinux.btf: Invalid argument

That's slightly different from Tomasz and Marcus' report (not sure if
that's relevant):

FAILED: load BTF from vmlinux: Invalid argument

That seems to come from
tools/bpf/resolve_btfids/main.c:529

Which seems like some failed call to btf_parse(). EINVAL is getting
propagated up from btf_parse(), but that's not super descriptive...

The hard part is that I suspect OpenMandriva (Tomasz) and Marcus are
both setting additional flags in their toolchains, which can make
reproducing tricky.

>
> ...occurs during BTF parsing when the raw size of the BTF is smaller
> than the BTF header size, which should never happen unless BTF
> is corrupted. Thing is, at that stage we shouldn't be parsing BTF,
> we should be generating it from DWARF. The only time pahole parses BTF
> is when it's creating split BTF for modules (it parses the base BTF), or
> when it's reading existing BTF, neither of which it should be doing at
> this stage.
>
> But I suspect the issue is in gen_btf() in scripts/link-vmlinux.sh.
> Prior to running pahole, we call "vmlinux_link .tmp_vmlinux.btf".
> If that went awry somehow and .tmp_vmlinux.btf wasn't created, it

Wouldn't we expect some kind of linker error though in that case?

> would explain the "Invalid argument" error:
>
> $ pahole -J nosuchfile
> pahole: nosuchfile: Invalid argument
>
> I see some clang specifics in vmlinux_link(), so I think a good
> first step would be to check if .tmp_vmlinux.btf exists prior
> to running pahole. The submitter mentioned swapping linkers seems to
> help, so that seems a promising angle. If there's a kernel .config
> available I can try and reproduce the failure too. Thanks!
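
For anyone trying that first step, here is a minimal sketch of the check,
assuming it is run by hand in the build directory (or pasted temporarily
into gen_btf()) right after the vmlinux_link call. check_link_output is a
hypothetical helper name used only for illustration; it does not exist in
scripts/link-vmlinux.sh.

```shell
# Hypothetical debugging helper (assumption: not part of the kernel tree).
# Succeeds only if the given file exists and is non-empty, so a missing or
# truncated link output is reported before pahole is ever invoked.
check_link_output() {
	if [ ! -s "$1" ]; then
		echo "link output missing or empty: $1" >&2
		return 1
	fi
	return 0
}

# Intended use, just before the pahole step:
#   check_link_output .tmp_vmlinux.btf || exit 1
```

If the helper reports a problem, that points at the link step rather than
at pahole or libbpf; if it passes, the file exists and the "Invalid
argument" has to come from somewhere else.
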
>
> Alan
>
> >>
> >>   LD      .tmp_vmlinux.btf
> >>   BTF     .btf.vmlinux.bin.o
> >>   LD      .tmp_vmlinux.kallsyms1
> >>
> >> And
> >>
> >> / # strings /sys/kernel/btf/vmlinux | wc -l
> >> 89921
> >> / # strings /sys/kernel/btf/vmlinux | grep -w kfree
> >> kfree
> >>
> >> It seems the BTF is correctly generated. (with DWARF5, the number of symbols
> >> are about 30000.)
> >
> >
>

-- 
Thanks,
~Nick Desaulniers