On Fri, Aug 04, 2023 at 11:03:03PM +0100, Alan Maguire wrote:
> On 04/08/2023 17:11, Nick Desaulniers wrote:
> > + Marcus (who also just reported seeing this
> > https://github.com/ClangBuiltLinux/linux/issues/1825#issuecomment-1664671027
> > and might be able to help reproduce).
> > + Fangrui (because seeing dd used as a result of 90ceddcb4950 makes me shudder)
> >
> > On Thu, Aug 3, 2023 at 3:10 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> >>
> >> On 03/08/2023 21:50, Nick Desaulniers wrote:
> >>> On Thu, Aug 3, 2023 at 1:39 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, Aug 03, 2023 at 11:02:46AM -0700, Nick Desaulniers wrote:
> >>>>> Hi Martin (and BTF/BPF team),
> >>>>> I've observed 2 user reports with the error from the subject of this email.
> >>>>> https://github.com/ClangBuiltLinux/linux/issues/1825
> >>>>> https://bbs.archlinux.org/viewtopic.php?id=284177
> >>>>>
> >>>>> Any chance you could take a look at these reports and help us figure
> >>>>> out what's going wrong here? Nathan and I haven't been able to
> >>>>> reproduce, but this seems to be affecting OpenMandriva (and Tomasz).
> >>>>>
> >>>>> Sounds like perhaps llvm-objcopy vs gnu objcopy might be a relevant detail?
> >>>>
> >>>> Masami had a problem with new versions of compilers that was solved
> >>>> with:
> >>>>
> >>>> ------------------------ 8< --------------------------------------------
> >>>>> To check that please tweak:
> >>>>>
> >>>>> ⬢[acme@toolbox perf-tools-next]$ grep DWARF ../build/v6.2-rc5+/.config
> >>>>> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> >>>>> # CONFIG_DEBUG_INFO_DWARF4 is not set
> >>>>> # CONFIG_DEBUG_INFO_DWARF5 is not set
> >>>>> ⬢[acme@toolbox perf-tools-next]$
> >>>>>
> >>>>> i.e. disable CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT and enable
> >>>>> CONFIG_DEBUG_INFO_DWARF4.
> >>>>
> >>>> Hm, with CONFIG_DEBUG_INFO_DWARF4, no warnings were shown.
> >>>
> >>> Downgrading from the now-6-year-old DWARFv5 to now-13-year-old DWARFv4
> >>> is not what I'd consider a fix. Someday we can move to
> >>> DWARFv5...someday...
> >>>
> >>> What you describe sounds like build success, but reduction in debug info.
> >>>
> >>> The reports I'm referring to seem to result in a build failure.
> >>>
> >>
> >> This is a strange one. The error in question
> >>
> >>   CC      .vmlinux.export.o
> >>   UPD     include/generated/utsversion.h
> >>   CC      init/version-timestamp.o
> >>   LD      .tmp_vmlinux.btf
> >>   BTF     .btf.vmlinux.bin.o
> >> libbpf: BTF header not found
> >> pahole: .tmp_vmlinux.btf: Invalid argument
> >
> > That's slightly different from Tomasz and Marcus' report (not sure if
> > that's relevant):
> >
> > FAILED: load BTF from vmlinux: Invalid argument
> >
> > That seems to come from
> > tools/bpf/resolve_btfids/main.c:529
> > Which seems like some failed call to btf_parse().
> > EINVAL is getting propagated up from btf_parse(), but that's not super
> > descriptive...
> >
> Okay, that makes more sense. Basically the stage where we read vmlinux
> BTF to do BTF id resolution (BTFIDS) is finding an empty BTF section.

+1, looks like pahole failed to generate the BTF section, so the BTFIDS
failure is just a follow-up error. We might want to consider a special
error message for missing BTF data ;-)

I can't reproduce this on my setup with either gcc or clang, trying the
DWARF4/5 config options with both the latest and the 1.24 pahole versions.

>
> > The hard part is that I suspect OpenMandriva (Tomasz) and Marcus are
> > both setting additional flags in their toolchains, which can make
> > reproducing tricky.
> >
>
> I tried falling back to the config referenced in the earlier bug report
>
> https://github.com/ClangBuiltLinux/linux/files/10050200/config_bpf.txt

hum, I did not find this in the report. Are there more kernel configs
related to this issue? Seems like more people are hitting this.

thanks,
jirka

>
> ...but still couldn't reproduce it with LLVM 17 + pahole v1.24.
> That config did specify DWARF5; if we can reproduce this, it would probably
> be good to vary between forcing DWARF4 and DWARF5 to see if that is a
> contributing factor as Arnaldo suggested.
>
> Alan
>
> >>
> >> ...occurs during BTF parsing when the raw size of the BTF is smaller
> >> than the BTF header size, which should never happen unless BTF
> >> is corrupted. Thing is, at that stage we shouldn't be parsing BTF,
> >> we should be generating it from DWARF. The only time pahole parses BTF
> >> is when it's creating split BTF for modules (it parses the base BTF), or
> >> when it's reading existing BTF, neither of which it should be doing at
> >> this stage.
> >>
> >> But I suspect the issue is in gen_btf() in scripts/link-vmlinux.sh.
> >> Prior to running pahole, we call "vmlinux_link .tmp_vmlinux.btf".
> >> If that went awry somehow and .tmp_vmlinux.btf wasn't created, it
> >
> > Wouldn't we expect some kind of linker error though in that case?
> >
> >> would explain the "Invalid argument" error:
> >>
> >> $ pahole -J nosuchfile
> >> pahole: nosuchfile: Invalid argument
> >>
> >> I see some clang specifics in vmlinux_link(), so I think a good
> >> first step would be to check if .tmp_vmlinux.btf exists prior
> >> to running pahole. The submitter mentioned swapping linkers seems to
> >> help, so that seems a promising angle. If there's a kernel .config
> >> available I can try and reproduce the failure too. Thanks!
> >>
> >> Alan
> >>
> >>>>
> >>>>   LD      .tmp_vmlinux.btf
> >>>>   BTF     .btf.vmlinux.bin.o
> >>>>   LD      .tmp_vmlinux.kallsyms1
> >>>>
> >>>> And
> >>>>
> >>>> / # strings /sys/kernel/btf/vmlinux | wc -l
> >>>> 89921
> >>>> / # strings /sys/kernel/btf/vmlinux | grep -w kfree
> >>>> kfree
> >>>>
> >>>> It seems the BTF is correctly generated. (with DWARF5, the number of symbols
> >>>> are about 30000.)
> >>>
> >>>
> >>>
> >
> >
>
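
For anyone who can reproduce this, a quick way to tell "empty or truncated BTF section" apart from "BTF present but malformed" might be to dump the .BTF section to a file (objcopy's --dump-section should do it) and look at the header by hand. Below is a rough sketch of that check, mirroring the "raw size smaller than the BTF header size" condition Alan describes above. The helper name check_btf_blob is made up for illustration and is not part of pahole or libbpf; the layout follows struct btf_header from include/uapi/linux/btf.h, and little-endian data is assumed.

```python
import struct

BTF_MAGIC = 0xEB9F

# struct btf_header: u16 magic, u8 version, u8 flags, u32 hdr_len,
# u32 type_off, u32 type_len, u32 str_off, u32 str_len (24 bytes).
# Little-endian is an assumption; big-endian targets would need ">".
BTF_HEADER = struct.Struct("<HBBIIIII")


def check_btf_blob(raw: bytes) -> str:
    """Classify a raw .BTF blob (hypothetical helper, for illustration only)."""
    # Same condition that yields "BTF header not found": the blob is
    # shorter than the fixed-size header.
    if len(raw) < BTF_HEADER.size:
        return "header not found"
    magic, _version, _flags, hdr_len, *_offsets = BTF_HEADER.unpack_from(raw)
    if magic != BTF_MAGIC:
        return "bad magic"
    # hdr_len must cover at least the fixed header and fit inside the blob.
    if hdr_len < BTF_HEADER.size or hdr_len > len(raw):
        return "bad header length"
    return "ok"


if __name__ == "__main__":
    import sys

    # Usage (file name is an example): python3 check_btf.py btf.bin
    with open(sys.argv[1], "rb") as f:
        print(check_btf_blob(f.read()))
```

A result of "header not found" on the dumped section would point at the section being empty or truncated before pahole or resolve_btfids ever parses it, which fits the theory that the earlier link or objcopy step is where things go wrong.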