On Thu, Dec 15, 2022 at 4:10 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > On Thu, Dec 15, 2022 at 3:00 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > On Wed, Dec 7, 2022 at 12:33 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > On Wed, Dec 7, 2022 at 12:16 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > > > > > On Tue, Dec 6, 2022 at 2:53 PM Andrii Nakryiko > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > adding bpf@vger back > > > > > > > > > > On Tue, Dec 6, 2022 at 12:15 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Tue, Dec 6, 2022 at 10:38 AM Andrii Nakryiko > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > > > > On Thu, Dec 1, 2022 at 12:20 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > On Thu, Dec 1, 2022 at 11:56 AM Andrii Nakryiko > > > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > On Wed, Nov 30, 2022 at 4:21 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > On Wed, Nov 30, 2022 at 2:59 PM Andrii Nakryiko > > > > > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 21, 2022 at 4:00 PM Bill Wendling <morbo@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > LLD generates a zero-length .BTF section (BFD doesn't generate this > > > > > > > > > > > > section). It shares the same address as .BTF_ids (or any section below > > > > > > > > > > > > it). E.g.: > > > > > > > > > > > > > > > > > > > > > > > > [24] .BTF PROGBITS ffffffff825a1900 17a1900 000000 > > > > > > > > > > > > [25] .BTF_ids PROGBITS ffffffff825a1900 17a1900 000634 > > > > > > > > > > > > > > > > > > > > > > > > Writing new data to that section doesn't adjust the addresses of > > > > > > > > > > > > following sections. As a result, the "-J" flag produces a corrupted > > > > > > > > > > > > file, causing further commands to fail. > > > > > > > > > > > > > > > > > > > > > > > > Instead of trying to adjust everything, just add a new section with the > > > > > > > > > > > > .BTF data and adjust the name of the original .BTF section. (We can't > > > > > > > > > > > > remove the old .BTF section because it has variables that are referenced > > > > > > > > > > > > elsewhere.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Have you tried llvm-objcopy --update-section instead? Doesn't it work? > > > > > > > > > > > > > > > > > > > > > I gave it a quick try and it fails for me with this: > > > > > > > > > > > > > > > > > > > > llvm-objcopy: error: '.tmp_vmlinux.btf': cannot fit data of size > > > > > > > > > > 4470718 into section '.BTF' with size 0 that is part of a segment > > > > > > > > > > > > > > > > > > .BTF shouldn't be allocatable section, when added by pahole. I think > > > > > > > > > this is the problem. Can you confirm that that zero-sized .BTF is > > > > > > > > > marked as allocated and is put into one of ELF segments? Can we fix > > > > > > > > > that instead? > > > > > > > > > > > > > > > > > I think it does: > > > > > > > > > > > > > > > > [24] .BTF PROGBITS ffffffff825a1900 17a1900 000000 > > > > > > > > 00 WA 0 0 1 > > > > > > > > > > > > > > > > > > > > > > So this allocatable .BTF section, could it be because of linker script > > > > > > > in include/asm-generic/vmlinux.lds.h? Should we add some conditions > > > > > > > there to not emit .BTF if __startt_BTF == __stop_BTF (i.e., no BTF > > > > > > > data is present) to avoid this issue in the first place? > > > > > > > > > > > > > It looks like keeping the .BTF section around is intentional: > > > > > > > > > > > > commit 65c204398928 ("bpf: Prevent .BTF section elimination") > > > > > > > > > > > > I assume that patch isn't meant if the section is zero sized... > > > > > > > > > > yep, we need to keep it only if it's non-empty > > > > > > > > > > > > > > > > > I was able to get a working system with two patches: one to Linux and > > > > > > one to pahole. The Linux patch specifies that the .BTF section > > > > > > shouldn't be allocatable. > > > > > > > > > > That's not right, we do want this .BTF section to be allocatable, > > > > > kernel expects this content to be accessible at runtime. So Linux-side > > > > > change is wrong. Is it possible to add some conditional statement to > > > > > linker script to keep .BTF only if .BTF is non-empty? > > > > > > > > > I thought you said the .BTF section shouldn't be allocatable. Is that > > > > only when it's added by pahole? The issue isn't really the section > > > > that's added by pahole, but the section as it's generated by LLD. > > > > > > Yeah, it's confusing. Pahole is not a linker and can't properly embed > > > .BTF into a data segment inside ELF. So the only choice is to add it > > > as nono-allocatable ELF section. > > > > > > But .BTF as part of vmlinux image *has* to be loadable, as kernel from > > > inside expect to have access to BTF contents. So that's why we use > > > linker script to embed .BTF into data segment as allocatable. > > > > > > Generally, the process is that during vmlinux building we add .BTF > > > contents to a temporary vmlinux file using pahole. Then during final > > > linking (at the same time as we add kallsyms) we rely on linker to > > > make .BTF allocatable and add those __start_BTF/__stop_BTF markers. > > > > > > Hope that clarifies this a bit. > > > > > I think I understand now. Thanks! :-) > > > > > > > > > > I don't know of a way to add conditional code to a linker script. I > > > > suspect we'd need the equivalent of this: > > > > > > > > .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) { > > > > __start_BTF = .; > > > > KEEP(*(.BTF)) > > > > __stop_BTF = .; > > > > } > > > > SIZEOF(.BTF) == 0 && /DISCARD/ { *(.BTF) } # This doesn't work. > > > > > > Honestly, no idea, I barely ever used linker scripts. Was hoping > > > someone else will be able to figure this out and I won't have to learn > > > this :) > > > > > > > > > > > > > The pahole patch uses --update-section if > > > > > > the section exists rather than writing out a new ELF file. Thoughts? > > > > > > > > > > That might be ok, because we already have dependency on llvm-objcopy. > > > > > But also it's unnecessary change if the section in not allocated, > > > > > right? Or why do we need to switch to llvm-objcopy in this case? > > > > > > > > > Not using llvm-objcopy was still messing up the ELF file. When you > > > > used `readelf -lW .tmp_vmlinux.btf` the "Section to Segment mapping" > > > > is trashed. > > > > > > If .BTF is not allocatable, there is no section to segment mapping, is there? > > > > > > > > > > > I'm a bit worried still that even if we modify the Linux linker > > > > scripts to remove a zero-sized .BTF section non-Linux projects using > > > > pahole will hit this issue. (Or is Linux meant to be the sole user of > > > > pahole?) > > > > > > That's the single most important case that we care about (note that > > > the same thing happens for kernel modules). Nothing prevents others > > > from using pahole for similar reasons with their custom apps. > > > > > > > > > > > The purpose of the `-J` option is to add BTF data and the next command > > > > in scripts/link-linux.sh extracts that data into its own file. The > > > > .tmp_vmlinux.btf that pahole modified is then no longer used. Why not > > > > cut out the middleman and have `-J` write the BTF data directly to a > > > > file? Does it need to be in a special format? > > > > > > > > > > That's exactly what I'm proposing: > > > > > > > > > > > > Also, more generally, newer paholes (not that new anymore, it's been a > > > > > > > > > supported feature for a while) support emitting BTF as raw binary > > > > > > > > > files, instead of embedding them into ELF. I think this is a nicer and > > > > > > > > > simpler option and we should switch link-vmlinux.sh to use that > > > > > > > > > instead, if pahole is new enough. > > > > > > We dump .BTF contents as raw bytes. Then embed with objcopy. That's the goal. > > > > > I knew I had a good idea! :-D > > > > I tried emitting the BTF data with the --btf_encode_detached flag. > > However, it's emitted as a binary file, which the linker isn't able to > > handle. We could process it afterwards with "objcopy", but I'm not > > sure how to grab the correct output BFD name. So something like this: > > > > $ objcopy --input-target=binary --output-target=??? > > .btf.vmlinux.bin.o.raw .btf.vmlinux.bin.o > > > > What should I put in place of "???"? I can't seem to find a tool that > > will give me a BFD name so that I could use the same type as the > > ".tmp_vmlinux.btf.o" file. > > > Okay, what are your thoughts on this: > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > index 918470d768e9..9355893acb12 100755 > --- a/scripts/link-vmlinux.sh > +++ b/scripts/link-vmlinux.sh > @@ -125,15 +125,27 @@ gen_btf() > vmlinux_link ${1} > > info "BTF" ${2} > - LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${PAHOLE_FLAGS} ${1} > > - # Create ${2} which contains just .BTF section but no symbols. Add > - # SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all > - # deletes all symbols including __start_BTF and __stop_BTF, which will > - # be redefined in the linker script. Add 2>/dev/null to suppress GNU > - # objcopy warnings: "empty loadable segment detected at ..." > - ${OBJCOPY} --only-section=.BTF --set-section-flags .BTF=alloc,readonly \ > - --strip-all ${1} ${2} 2>/dev/null > + # Write the raw BTF data instead of inserting it into the temp vmlinux > + # file. This avoids an issue where an existing .BTF section doesn't have > + # the appropriate size to handle the BTF data pahole wants to add. It > + # turns out that objcopy doesn't adjust the addresses of following > + # sections, resulting in the .BTF section overwriting data in other > + # sections. If the data is large, it could corrupt the file enough so > + # that it's no longer usable. > + LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J > --btf_encode_detached=${2}.raw ${1} we can't do this unconditionally because --btf_encode_detached was added a bit later and we still want to support old pahole versions. So there will need to be some sort of version check. > + > + # Create an empty ${2} ELF file then add the .BTF section from the raw > + # data. > + # * Add SHF_ALLOC because .BTF will be part of the vmlinux image. > + # * --strip-all deletes all symbols including __start_BTF and > + # __stop_BTF, which are redefined in the linker script. > + # * Add 2>/dev/null to suppress GNU objcopy warnings: > + # "empty loadable segment detected at ..." > + ${CC} -o ${2} -c -x c - < /dev/null > + ${OBJCOPY} --add-section .BTF=${2}.raw --set-section-flags > .BTF=alloc,readonly \ do we need to set alloc here? objcopy can't do anything smart about alloc sections, so this seems misleading. Linker script should take of making .BTF in final vmlinux allocatable? > + --strip-all ${2} 2>/dev/null > + > # Change e_type to ET_REL so that it can be used to link final vmlinux. > # Unlike GNU ld, lld does not allow an ET_EXEC input. > printf '\1' | dd of=${2} conv=notrunc bs=1 seek=16 status=none > > > > -bw > > > > > > -bw > > > > > > > > > > > > > > > > Linux patch: > > > > > > > > > > > > diff --git a/include/asm-generic/vmlinux.lds.h > > > > > > b/include/asm-generic/vmlinux.lds.h > > > > > > index 3dc5824141cd..5bea090b736e 100644 > > > > > > --- a/include/asm-generic/vmlinux.lds.h > > > > > > +++ b/include/asm-generic/vmlinux.lds.h > > > > > > @@ -680,7 +680,7 @@ > > > > > > */ > > > > > > #ifdef CONFIG_DEBUG_INFO_BTF > > > > > > #define BTF \ > > > > > > - .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) { \ > > > > > > + .BTF (INFO) : AT(ADDR(.BTF) - LOAD_OFFSET) { \ > > > > > > __start_BTF = .; \ > > > > > > KEEP(*(.BTF)) \ > > > > > > __stop_BTF = .; \ > > > > > > > > > > > > pahole patch: > > > > > > > > > > > > [...] > > > > > > > > > > > > > > > > > > > > > > > Fangrui mentioned something similar to this in a previous message: > > > > > > > > > > > > > > > > https://lore.kernel.org/dwarves/20210317232657.mdnsuoqx6nbddjgt@xxxxxxxxxx/T/#u > > > > > > > > > > > > > > > > > Also, more generally, newer paholes (not that new anymore, it's been a > > > > > > > > > supported feature for a while) support emitting BTF as raw binary > > > > > > > > > files, instead of embedding them into ELF. I think this is a nicer and > > > > > > > > > simpler option and we should switch link-vmlinux.sh to use that > > > > > > > > > instead, if pahole is new enough. > > > > > > > > > > > > > > > > > > Hopefully eventually we can get rid of all the old pahole version > > > > > > > > > cruft, but for now it's inevitable to support both modes, of course. > > > > > > > > > > > > > > > > > > > > > > > > > Ah technical debt! :-) > > > > > > > > > > > > > > Yep, it would be good to get contributions to address it ;) It's > > > > > > > better than hacks with renaming of sections, *wink wink* :) > > > > > > > > > > > > > ;-) > > > > > > > > > > > > -bw > > > > > > > > > > > > > > > > > > > > > > -bw > > > > > > > > > > > > > > > > > > btf_encoder__write_elf: failed to add .BTF section to '.tmp_vmlinux.btf': 2! > > > > > > > > > > Failed to encode BTF > > > > > > > > > > > > > > > > > > > > -bw > > > > > > > > > > > > > > > > > > > > > > Link: https://lore.kernel.org/dwarves/20210317232657.mdnsuoqx6nbddjgt@xxxxxxxxxx/ > > > > > > > > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> > > > > > > > > > > > > Cc: Fangrui Song <maskray@xxxxxxxxxx> > > > > > > > > > > > > Cc: dwarves@xxxxxxxxxxxxxxx > > > > > > > > > > > > Signed-off-by: Bill Wendling <morbo@xxxxxxxxxx> > > > > > > > > > > > > --- > > > > > > > > > > > > btf_encoder.c | 88 +++++++++++++++++++++++++++++++-------------------- > > > > > > > > > > > > 1 file changed, 54 insertions(+), 34 deletions(-) > > > > > > > > > > > > > > > > > > [...]