Re: [PATCH 1/1] btf_encoder: Generate a new .BTF section even if one exists

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 15, 2022 at 4:10 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
>
> On Thu, Dec 15, 2022 at 3:00 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> >
> >  On Wed, Dec 7, 2022 at 12:33 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 7, 2022 at 12:16 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Dec 6, 2022 at 2:53 PM Andrii Nakryiko
> > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > >
> > > > > adding bpf@vger back
> > > > >
> > > > > On Tue, Dec 6, 2022 at 12:15 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Tue, Dec 6, 2022 at 10:38 AM Andrii Nakryiko
> > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 1, 2022 at 12:20 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Thu, Dec 1, 2022 at 11:56 AM Andrii Nakryiko
> > > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Nov 30, 2022 at 4:21 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Nov 30, 2022 at 2:59 PM Andrii Nakryiko
> > > > > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Nov 21, 2022 at 4:00 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > LLD generates a zero-length .BTF section (BFD doesn't generate this
> > > > > > > > > > > > section). It shares the same address as .BTF_ids (or any section below
> > > > > > > > > > > > it). E.g.:
> > > > > > > > > > > >
> > > > > > > > > > > >   [24] .BTF              PROGBITS        ffffffff825a1900 17a1900 000000
> > > > > > > > > > > >   [25] .BTF_ids          PROGBITS        ffffffff825a1900 17a1900 000634
> > > > > > > > > > > >
> > > > > > > > > > > > Writing new data to that section doesn't adjust the addresses of
> > > > > > > > > > > > following sections. As a result, the "-J" flag produces a corrupted
> > > > > > > > > > > > file, causing further commands to fail.
> > > > > > > > > > > >
> > > > > > > > > > > > Instead of trying to adjust everything, just add a new section with the
> > > > > > > > > > > > .BTF data and adjust the name of the original .BTF section. (We can't
> > > > > > > > > > > > remove the old .BTF section because it has variables that are referenced
> > > > > > > > > > > > elsewhere.)
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Have you tried llvm-objcopy --update-section instead? Doesn't it work?
> > > > > > > > > > >
> > > > > > > > > > I gave it a quick try and it fails for me with this:
> > > > > > > > > >
> > > > > > > > > > llvm-objcopy: error: '.tmp_vmlinux.btf': cannot fit data of size
> > > > > > > > > > 4470718 into section '.BTF' with size 0 that is part of a segment
> > > > > > > > >
> > > > > > > > > .BTF shouldn't be allocatable section, when added by pahole. I think
> > > > > > > > > this is the problem. Can you confirm that that zero-sized .BTF is
> > > > > > > > > marked as allocated and is put into one of ELF segments? Can we fix
> > > > > > > > > that instead?
> > > > > > > > >
> > > > > > > > I think it does:
> > > > > > > >
> > > > > > > > [24] .BTF              PROGBITS        ffffffff825a1900 17a1900 000000
> > > > > > > > 00  WA  0   0  1
> > > > > > > >
> > > > > > >
> > > > > > > So this allocatable .BTF section, could it be because of linker script
> > > > > > > in include/asm-generic/vmlinux.lds.h? Should we add some conditions
> > > > > > > there to not emit .BTF if __startt_BTF == __stop_BTF (i.e., no BTF
> > > > > > > data is present) to avoid this issue in the first place?
> > > > > > >
> > > > > > It looks like keeping the .BTF section around is intentional:
> > > > > >
> > > > > >   commit 65c204398928 ("bpf: Prevent .BTF section elimination")
> > > > > >
> > > > > > I assume that patch isn't meant if the section is zero sized...
> > > > >
> > > > > yep, we need to keep it only if it's non-empty
> > > > >
> > > > > >
> > > > > > I was able to get a working system with two patches: one to Linux and
> > > > > > one to pahole. The Linux patch specifies that the .BTF section
> > > > > > shouldn't be allocatable.
> > > > >
> > > > > That's not right, we do want this .BTF section to be allocatable,
> > > > > kernel expects this content to be accessible at runtime. So Linux-side
> > > > > change is wrong. Is it possible to add some conditional statement to
> > > > > linker script to keep .BTF only if .BTF is non-empty?
> > > > >
> > > > I thought you said the .BTF section shouldn't be allocatable. Is that
> > > > only when it's added by pahole? The issue isn't really the section
> > > > that's added by pahole, but the section as it's generated by LLD.
> > >
> > > Yeah, it's confusing. Pahole is not a linker and can't properly embed
> > > .BTF into a data segment inside ELF. So the only choice is to add it
> > > as nono-allocatable ELF section.
> > >
> > > But .BTF as part of vmlinux image *has* to be loadable, as kernel from
> > > inside expect to have access to BTF contents. So that's why we use
> > > linker script to embed .BTF into data segment as allocatable.
> > >
> > > Generally, the process is that during vmlinux building we add .BTF
> > > contents to a temporary vmlinux file using pahole. Then during final
> > > linking (at the same time as we add kallsyms) we rely on linker to
> > > make .BTF allocatable and add those __start_BTF/__stop_BTF markers.
> > >
> > > Hope that clarifies this a bit.
> > >
> > I think I understand now. Thanks! :-)
> >
> > > >
> > > > I don't know of a way to add conditional code to a linker script. I
> > > > suspect we'd need the equivalent of this:
> > > >
> > > >   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
> > > >     __start_BTF = .;
> > > >     KEEP(*(.BTF))
> > > >     __stop_BTF = .;
> > > >   }
> > > >   SIZEOF(.BTF) == 0 && /DISCARD/ { *(.BTF) }   # This doesn't work.
> > >
> > > Honestly, no idea, I barely ever used linker scripts. Was hoping
> > > someone else will be able to figure this out and I won't have to learn
> > > this :)
> > >
> > > >
> > > > > > The pahole patch uses --update-section if
> > > > > > the section exists rather than writing out a new ELF file. Thoughts?
> > > > >
> > > > > That might be ok, because we already have dependency on llvm-objcopy.
> > > > > But also it's unnecessary change if the section in not allocated,
> > > > > right? Or why do we need to switch to llvm-objcopy in this case?
> > > > >
> > > > Not using llvm-objcopy was still messing up the ELF file. When you
> > > > used `readelf -lW .tmp_vmlinux.btf` the "Section to Segment mapping"
> > > > is trashed.
> > >
> > > If .BTF is not allocatable, there is no section to segment mapping, is there?
> > >
> > > >
> > > > I'm a bit worried still that even if we modify the Linux linker
> > > > scripts to remove a zero-sized .BTF section non-Linux projects using
> > > > pahole will hit this issue. (Or is Linux meant to be the sole user of
> > > > pahole?)
> > >
> > > That's the single most important case that we care about (note that
> > > the same thing happens for kernel modules). Nothing prevents others
> > > from using pahole for similar reasons with their custom apps.
> > >
> > > >
> > > > The purpose of the `-J` option is to add BTF data and the next command
> > > > in scripts/link-linux.sh extracts that data into its own file. The
> > > > .tmp_vmlinux.btf that pahole modified is then no longer used. Why not
> > > > cut out the middleman and have `-J` write the BTF data directly to a
> > > > file? Does it need to be in a special format?
> > > >
> > >
> > > That's exactly what I'm proposing:
> > >
> > > > > > > > > Also, more generally, newer paholes (not that new anymore, it's been a
> > > > > > > > > supported feature for a while) support emitting BTF as raw binary
> > > > > > > > > files, instead of embedding them into ELF. I think this is a nicer and
> > > > > > > > > simpler option and we should switch link-vmlinux.sh to use that
> > > > > > > > > instead, if pahole is new enough.
> > >
> > > We dump .BTF contents as raw bytes. Then embed with objcopy. That's the goal.
> > >
> > I knew I had a good idea! :-D
> >
> > I tried emitting the BTF data with the --btf_encode_detached flag.
> > However, it's emitted as a binary file, which the linker isn't able to
> > handle. We could process it afterwards with "objcopy", but I'm not
> > sure how to grab the correct output BFD name. So something like this:
> >
> > $ objcopy --input-target=binary --output-target=???
> > .btf.vmlinux.bin.o.raw .btf.vmlinux.bin.o
> >
> > What should I put in place of "???"? I can't seem to find a tool that
> > will give me a BFD name so that I could use the same type as the
> > ".tmp_vmlinux.btf.o" file.
> >
> Okay, what are your thoughts on this:
>
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index 918470d768e9..9355893acb12 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -125,15 +125,27 @@ gen_btf()
>         vmlinux_link ${1}
>
>         info "BTF" ${2}
> -       LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${PAHOLE_FLAGS} ${1}
>
> -       # Create ${2} which contains just .BTF section but no symbols. Add
> -       # SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
> -       # deletes all symbols including __start_BTF and __stop_BTF, which will
> -       # be redefined in the linker script. Add 2>/dev/null to suppress GNU
> -       # objcopy warnings: "empty loadable segment detected at ..."
> -       ${OBJCOPY} --only-section=.BTF --set-section-flags .BTF=alloc,readonly \
> -               --strip-all ${1} ${2} 2>/dev/null
> +       # Write the raw BTF data instead of inserting it into the temp vmlinux
> +       # file. This avoids an issue where an existing .BTF section doesn't have
> +       # the appropriate size to handle the BTF data pahole wants to add. It
> +       # turns out that objcopy doesn't adjust the addresses of following
> +       # sections, resulting in the .BTF section overwriting data in other
> +       # sections. If the data is large, it could corrupt the file enough so
> +       # that it's no longer usable.
> +       LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J
> --btf_encode_detached=${2}.raw ${1}

we can't do this unconditionally because --btf_encode_detached was
added a bit later and we still want to support old pahole versions. So
there will need to be some sort of version check.

> +
> +       # Create an empty ${2} ELF file then add the .BTF section from the raw
> +       # data.
> +       #     * Add SHF_ALLOC because .BTF will be part of the vmlinux image.
> +       #     * --strip-all deletes all symbols including __start_BTF and
> +        #       __stop_BTF, which are redefined in the linker script.
> +       #     * Add 2>/dev/null to suppress GNU objcopy warnings:
> +       #           "empty loadable segment detected at ..."
> +       ${CC} -o ${2} -c -x c - < /dev/null
> +       ${OBJCOPY} --add-section .BTF=${2}.raw --set-section-flags
> .BTF=alloc,readonly \

do we need to set alloc here? objcopy can't do anything smart about
alloc sections, so this seems misleading. Linker script should take of
making .BTF in final vmlinux allocatable?

> +               --strip-all ${2} 2>/dev/null
> +
>         # Change e_type to ET_REL so that it can be used to link final vmlinux.
>         # Unlike GNU ld, lld does not allow an ET_EXEC input.
>         printf '\1' | dd of=${2} conv=notrunc bs=1 seek=16 status=none
>
>
> > -bw
> >
> > > > -bw
> > > >
> > > > > >
> > > > > > Linux patch:
> > > > > >
> > > > > > diff --git a/include/asm-generic/vmlinux.lds.h
> > > > > > b/include/asm-generic/vmlinux.lds.h
> > > > > > index 3dc5824141cd..5bea090b736e 100644
> > > > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > > > @@ -680,7 +680,7 @@
> > > > > >   */
> > > > > >  #ifdef CONFIG_DEBUG_INFO_BTF
> > > > > >  #define BTF                                                            \
> > > > > > -       .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {                           \
> > > > > > +       .BTF (INFO) : AT(ADDR(.BTF) - LOAD_OFFSET) {                    \
> > > > > >                 __start_BTF = .;                                        \
> > > > > >                 KEEP(*(.BTF))                                           \
> > > > > >                 __stop_BTF = .;                                         \
> > > > > >
> > > > > > pahole patch:
> > > > > >
> > >
> > > [...]
> > >
> > > > > >
> > > > > >
> > > > > > > > Fangrui mentioned something similar to this in a previous message:
> > > > > > > >
> > > > > > > >   https://lore.kernel.org/dwarves/20210317232657.mdnsuoqx6nbddjgt@xxxxxxxxxx/T/#u
> > > > > > > >
> > > > > > > > > Also, more generally, newer paholes (not that new anymore, it's been a
> > > > > > > > > supported feature for a while) support emitting BTF as raw binary
> > > > > > > > > files, instead of embedding them into ELF. I think this is a nicer and
> > > > > > > > > simpler option and we should switch link-vmlinux.sh to use that
> > > > > > > > > instead, if pahole is new enough.
> > > > > > > > >
> > > > > > > > > Hopefully eventually we can get rid of all the old pahole version
> > > > > > > > > cruft, but for now it's inevitable to support both modes, of course.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Ah technical debt! :-)
> > > > > > >
> > > > > > > Yep, it would be good to get contributions to address it ;) It's
> > > > > > > better than hacks with renaming of sections, *wink wink* :)
> > > > > > >
> > > > > > ;-)
> > > > > >
> > > > > > -bw
> > > > > >
> > > > > > > >
> > > > > > > > -bw
> > > > > > > >
> > > > > > > > > > btf_encoder__write_elf: failed to add .BTF section to '.tmp_vmlinux.btf': 2!
> > > > > > > > > > Failed to encode BTF
> > > > > > > > > >
> > > > > > > > > > -bw
> > > > > > > > > >
> > > > > > > > > > > > Link: https://lore.kernel.org/dwarves/20210317232657.mdnsuoqx6nbddjgt@xxxxxxxxxx/
> > > > > > > > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > > > > > > > > > > Cc: Fangrui Song <maskray@xxxxxxxxxx>
> > > > > > > > > > > > Cc: dwarves@xxxxxxxxxxxxxxx
> > > > > > > > > > > > Signed-off-by: Bill Wendling <morbo@xxxxxxxxxx>
> > > > > > > > > > > > ---
> > > > > > > > > > > >  btf_encoder.c | 88 +++++++++++++++++++++++++++++++--------------------
> > > > > > > > > > > >  1 file changed, 54 insertions(+), 34 deletions(-)
> > > > > > > > > > > >
> > >
> > > [...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux