Re: FAILED: load BTF from vmlinux: Invalid argument

On 04/08/2023 17:11, Nick Desaulniers wrote:
> + Marcus (who also just reported seeing this
> https://github.com/ClangBuiltLinux/linux/issues/1825#issuecomment-1664671027
> and might be able to help reproduce).
> + Fangrui (because seeing dd used as a result of 90ceddcb4950 makes me shudder)
> 
> On Thu, Aug 3, 2023 at 3:10 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>
>> On 03/08/2023 21:50, Nick Desaulniers wrote:
>>> On Thu, Aug 3, 2023 at 1:39 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>>>>
>>>> Em Thu, Aug 03, 2023 at 11:02:46AM -0700, Nick Desaulniers escreveu:
>>>>> Hi Martin (and BTF/BPF team),
>>>>> I've observed 2 user reports with the error from the subject of this email.
>>>>> https://github.com/ClangBuiltLinux/linux/issues/1825
>>>>> https://bbs.archlinux.org/viewtopic.php?id=284177
>>>>>
>>>>> Any chance you could take a look at these reports and help us figure
>>>>> out what's going wrong here?  Nathan and I haven't been able to
>>>>> reproduce, but this seems to be affecting OpenMandriva (and Tomasz).
>>>>>
>>>>> Sounds like perhaps llvm-objcopy vs gnu objcopy might be a relevant detail?
>>>>
>>>> Masami had a problem with new versions of compilers that was solved
>>>> with:
>>>>
>>>> ------------------------ 8< --------------------------------------------
>>>>> To check that please tweak:
>>>>>
>>>>> ⬢[acme@toolbox perf-tools-next]$ grep DWARF ../build/v6.2-rc5+/.config
>>>>> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
>>>>> # CONFIG_DEBUG_INFO_DWARF4 is not set
>>>>> # CONFIG_DEBUG_INFO_DWARF5 is not set
>>>>> ⬢[acme@toolbox perf-tools-next]$
>>>>>
>>>>> i.e. disable CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT and enable
>>>>> CONFIG_DEBUG_INFO_DWARF4.
>>>>
>>>> Hm, with CONFIG_DEBUG_INFO_DWARF4, no warnings were shown.
>>>
>>> Downgrading from the now-6-year-old DWARFv5 to now-13-year-old DWARFv4
>>> is not what I'd consider a fix. Someday we can move to
>>> DWARFv5...someday...
>>>
>>> What you describe sounds like build success, but reduction in debug info.
>>>
>>> The reports I'm referring to seem to result in a build failure.
>>>
>>
>> This is a strange one. The error in question
>>
>> CC .vmlinux.export.o
>> UPD include/generated/utsversion.h
>> CC init/version-timestamp.o
>> LD .tmp_vmlinux.btf
>> BTF .btf.vmlinux.bin.o
>> libbpf: BTF header not found
>> pahole: .tmp_vmlinux.btf: Invalid argument
> 
> That's slightly different from Tomasz and Marcus' report (not sure if
> that's relevant):
> 
> FAILED: load BTF from vmlinux: Invalid argument
> 
> That seems to come from
> tools/bpf/resolve_btfids/main.c:529
> Which seems like some failed call to btf_parse().
> EINVAL is getting propagated up from btf_parse(), but that's not super
> descriptive...
> 
Okay, that makes more sense. Basically the stage where we read vmlinux
BTF to do BTF id resolution (BTFIDS) is finding an empty BTF section.
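
For anyone who wants to poke at a suspect BTF blob by hand, here is a
minimal Python sketch of the first sanity checks a BTF parser would
plausibly make. It is an illustration only, not libbpf's actual code:
the function name and the returned strings are made up for this
example, and it assumes little-endian data. The header layout (magic
0xeB9F, 24 bytes) follows struct btf_header in the kernel UAPI header.

```python
import struct

BTF_MAGIC = 0xEB9F
# struct btf_header: u16 magic, u8 version, u8 flags, u32 hdr_len,
# followed by u32 type_off, type_len, str_off, str_len (24 bytes total).
# Assumption: little-endian; a real parser also handles swapped byte order.
BTF_HEADER = struct.Struct("<HBBIIIII")


def check_btf_blob(data: bytes) -> str:
    """Hypothetical helper: classify a raw BTF blob by basic header checks."""
    # An empty or too-short section cannot even hold the header.
    if len(data) < BTF_HEADER.size:
        return "header not found"
    (magic, _version, _flags, hdr_len,
     type_off, type_len, str_off, str_len) = BTF_HEADER.unpack_from(data)
    if magic != BTF_MAGIC:
        return "bad magic"
    # Type and string sections are located relative to the end of the header.
    if hdr_len + max(type_off + type_len, str_off + str_len) > len(data):
        return "truncated"
    return "ok"
```

On a running kernel the blob can be read from /sys/kernel/btf/vmlinux;
an empty input falls into the first branch, which corresponds to the
empty-section case described above.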

> The hard part is that I suspect OpenMandriva (Tomasz) and Marcus are
> both setting additional flags in their toolchains, which can make
> reproducing tricky.
>

I tried falling back to the config referenced in the earlier bug report:

https://github.com/ClangBuiltLinux/linux/files/10050200/config_bpf.txt

...but still couldn't reproduce it with LLVM 17 + pahole v1.24. That
config did specify DWARF5; if we can reproduce this, it would probably
be good to vary between forcing DWARF4 and DWARF5 to see if that is a
contributing factor as Arnaldo suggested.

Alan

>>
>> ...occurs during BTF parsing when the raw size of the BTF is smaller
>> than the BTF header size, which should never happen unless BTF
>> is corrupted. Thing is, at that stage we shouldn't be parsing BTF,
>> we should be generating it from DWARF. The only time pahole parses BTF
>> is when it's creating split BTF for modules (it parses the base BTF), or
>> when it's reading existing BTF, neither of which it should be doing at
>> this stage.
>>
>> But I suspect the issue is in gen_btf() in scripts/link-vmlinux.sh.
>> Prior to running pahole, we call "vmlinux_link .tmp_vmlinux.btf".
>> If that went awry somehow and .tmp_vmlinux.btf wasn't created, it
> 
> Wouldn't we expect some kind of linker error though in that case?
> 
>> would explain the "Invalid argument" error:
>>
>> $ pahole -J nosuchfile
>> pahole: nosuchfile: Invalid argument
>>
>> I see some clang specifics in vmlinux_link(), so I think a good
>> first step would be to check if .tmp_vmlinux.btf exists prior
>> to running pahole. The submitter mentioned swapping linkers seems to
>> help, so that seems a promising angle. If there's a kernel .config
>> available I can try and reproduce the failure too. Thanks!
>>
>> Alan
>>
>>>>
>>>>   LD      .tmp_vmlinux.btf
>>>>   BTF     .btf.vmlinux.bin.o
>>>>   LD      .tmp_vmlinux.kallsyms1
>>>>
>>>> And
>>>>
>>>> / # strings /sys/kernel/btf/vmlinux | wc -l
>>>> 89921
>>>> / # strings /sys/kernel/btf/vmlinux | grep -w kfree
>>>> kfree
>>>>
>>>> It seems the BTF is correctly generated. (With DWARF5, the number of
>>>> symbols is about 30000.)
>>>
>>>
>>>
> 
> 
> 