Re: Latest libbpf fails to load programs compiled with old LLVM

On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:
>
> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@xxxxxx> wrote:
> >>
> >>
> >>
> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote:
> >> > Yonghong Song <yhs@xxxxxx> writes:
> >> >
> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote:
> >> >>> Hi Andrii
> >> >>>
> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled
> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I
> >> >>> get:
> >> >>>
> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv
> >> >>> Loading 1 files on interface 'testns'.
> >> >>> libbpf: loading ../lib/testing/xdp_drop.o
> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1
> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0
> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
> >> >>>
> >> >>> The 'failed to find program symbol' error seems to have been introduced
> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into
> >> >>> multiple BPF programs").
> >> >>>
> >> >>> Looking at the object file in question, indeed it seems to not have any
> >> >>> function symbols defined:
> >> >>>
> >> >>> $  llvm-objdump --syms ../lib/testing/xdp_drop.o
> >> >>>
> >> >>> ../lib/testing/xdp_drop.o:  file format elf64-bpf
> >> >>>
> >> >>> SYMBOL TABLE:
> >> >>> 0000000000000000 l       .debug_str 0000000000000000
> >> >>> 0000000000000037 l       .debug_str 0000000000000000
> >> >>> 0000000000000042 l       .debug_str 0000000000000000
> >> >>> 0000000000000068 l       .debug_str 0000000000000000
> >> >>> 0000000000000071 l       .debug_str 0000000000000000
> >> >>> 0000000000000076 l       .debug_str 0000000000000000
> >> >>> 000000000000008a l       .debug_str 0000000000000000
> >> >>> 0000000000000097 l       .debug_str 0000000000000000
> >> >>> 00000000000000a3 l       .debug_str 0000000000000000
> >> >>> 00000000000000ac l       .debug_str 0000000000000000
> >> >>> 00000000000000b5 l       .debug_str 0000000000000000
> >> >>> 00000000000000bc l       .debug_str 0000000000000000
> >> >>> 00000000000000c9 l       .debug_str 0000000000000000
> >> >>> 00000000000000d4 l       .debug_str 0000000000000000
> >> >>> 00000000000000dd l       .debug_str 0000000000000000
> >> >>> 00000000000000e1 l       .debug_str 0000000000000000
> >> >>> 00000000000000e5 l       .debug_str 0000000000000000
> >> >>> 00000000000000ea l       .debug_str 0000000000000000
> >> >>> 00000000000000f0 l       .debug_str 0000000000000000
> >> >>> 00000000000000f9 l       .debug_str 0000000000000000
> >> >>> 0000000000000103 l       .debug_str 0000000000000000
> >> >>> 0000000000000113 l       .debug_str 0000000000000000
> >> >>> 0000000000000122 l       .debug_str 0000000000000000
> >> >>> 0000000000000131 l       .debug_str 0000000000000000
> >> >>> 0000000000000000 l    d  prog       0000000000000000 prog
> >> >>> 0000000000000000 l    d  .debug_abbrev      0000000000000000 .debug_abbrev
> >> >>> 0000000000000000 l    d  .debug_info        0000000000000000 .debug_info
> >> >>> 0000000000000000 l    d  .debug_frame       0000000000000000 .debug_frame
> >> >>> 0000000000000000 l    d  .debug_line        0000000000000000 .debug_line
> >> >>> 0000000000000000 g       license    0000000000000000 _license
> >> >>> 0000000000000000 g       prog       0000000000000000 xdp_drop
> >> >>>
> >> >>>
> >> >>> I assume this is because old LLVM versions simply don't emit that symbol
> >> >>> information?
> >>
> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce
> >> the issue now.
> >>
> >> I added another function 'xdp_drop1' to the same file. Below is the
> >> symbol table with llvm7 vs. llvm12.
> >>
> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop
> >>      32: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop
> >>      33: 0000000000000010     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop1
> >>
> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020 00  AX  0   0  8
> >>
> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop
> >>      32: 0000000000000000    16 FUNC    GLOBAL DEFAULT     3 xdp_drop
> >>      33: 0000000000000010    16 FUNC    GLOBAL DEFAULT     3 xdp_drop1
> >> -bash-4.4$
> >>
> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020 00  AX  0   0  8
> >>
> >>
> >> Yes, llvm7 does not encode the type and size of FUNC symbols. I guess
> >> libbpf could change to recognize NOTYPE and use the symbol value (the
> >> offset from the start of the section) together with the section size
> >> to calculate each function's size. This is more complicated than
> >> having the ELF file provide the FUNC type and symbol size directly.
> >
> > I think we should just face the fact that LLVM7 is way too old to
> > produce a sensible BPF ELF file layout. We can extend:
> >
> > libbpf: sec 'prog': failed to find program symbol at offset 0
> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
> >
> > with a suggestion to upgrade Clang/LLVM to something more recent, if
> > that would be helpful.
> >
> > But I don't want to add error-prone checks and assumptions to the
> > already quite complicated logic. Even the kernel itself requires
> > Clang 10+ for its compilation. BPF CO-RE also doesn't work with
> > anything older than Clang 10, so lots of people have already
> > upgraded way beyond that.
>
> Wait, what? This is a regression that *breaks people's programs* on
> compiler versions that are still very much in the wild! I mean, fine if
> you don't want to support new features on such files, but then surely we
> can at least revert back to the old behaviour?

This is clearly a bug in LLVM 7, which didn't produce correct ELF
symbols. Do we agree on that? libbpf used to handle such invalid ELF
files *by accident*, until v0.2 changed its internal logic to be
stricter, so it no longer works with them. Does it need to add extra
quirks to support such broken ELF? I don't think so.

Users who can't upgrade from LLVM 7 to something less ancient can
stick with libbpf v0.1, which was lenient enough to accept such invalid
ELF files. libbpf v0.2 was released more than a month ago, and so far
you are the only one who has noticed this "regression". So hopefully
it's not super annoying to people, and they will be accommodating
enough to use a more up-to-date compiler (and save themselves lots of
trouble along the way).
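
For reference, the workaround Yonghong describes above (treat NOTYPE
symbols as functions and derive each size from the next symbol's offset
or the end of the section) could be sketched roughly as below. This is
not libbpf code: struct sym and infer_sym_sizes() are hypothetical names
made up for illustration, and the sketch assumes that every symbol in
the section is a function and that nothing else (padding, data) sits
between functions, which is exactly the kind of assumption that makes
this approach fragile.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified view of one global symbol in a program section. */
struct sym {
	const char *name;
	uint64_t value;	/* st_value: offset from the start of the section */
	uint64_t size;	/* st_size: 0 when the compiler did not record it */
};

static int cmp_sym_value(const void *a, const void *b)
{
	const struct sym *x = a, *y = b;

	return (x->value > y->value) - (x->value < y->value);
}

/*
 * Sort symbols by offset, then fill in every missing size as the distance
 * to the next symbol, or to the end of the section for the last one.
 * Returns 0 on success, -1 if a symbol starts outside the section.
 */
static int infer_sym_sizes(struct sym *syms, size_t n, uint64_t sec_size)
{
	size_t i;

	qsort(syms, n, sizeof(*syms), cmp_sym_value);
	for (i = 0; i < n; i++) {
		uint64_t end = (i + 1 < n) ? syms[i + 1].value : sec_size;

		if (syms[i].value >= sec_size)
			return -1;
		if (syms[i].size == 0)
			syms[i].size = end - syms[i].value;
	}
	return 0;
}
```

Fed the two NOTYPE symbols from xdp-7.o (offsets 0x0 and 0x10 in a
0x20-byte section), this yields 16 bytes for each, matching what LLVM 12
records directly.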

>
> > Speaking of legacy. Toke, can you please update all the samples in
> > your xdp-tools repo to not use arbitrary sections names. I see
> > SEC("prog"), where it should really be SEC("xdp"). It sets a bad
> > example for newcomers, IMO.
>
> I used "prog" because that's what iproute2 looks for if you don't supply

Ok.

> a section name, so it makes it convenient to load programs with 'ip'
> without supplying the section name. However, I do realise this is not
> the best of reasons, and I am not opposed to changing it. However...
>
> > I'm also going to emit warnings in libbpf soon for section names that
> > don't follow proper libbpf naming pattern, so it would be good if you
> > could get ahead of the curve.
>
> ...this sounds like just another way to annoy users by breaking things
> that were working before? :/

It won't break anything. libbpf will emit a warning about the need to
use the proper section name format, which will start to be enforced
only with a major version bump. That will give users plenty of time to
make sure their BPF programs are compatible with a stricter libbpf.
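
As an illustration of the naming in question, a minimal XDP program that
uses SEC("xdp") instead of SEC("prog") might look like the sketch below.
In a real BPF program, SEC(), struct xdp_md and XDP_DROP come from
<bpf/bpf_helpers.h> and <linux/bpf.h>; here they are replaced by local
stand-ins so the snippet compiles on its own. The function body is an
assumption modeled on the xdp_drop example from earlier in the thread,
not the actual xdp-tools source.

```c
/*
 * Stand-ins so the sketch is self-contained. A real program would
 * include <linux/bpf.h> and <bpf/bpf_helpers.h> instead.
 */
#define SEC(name) __attribute__((section(name), used))
struct xdp_md;			/* opaque here; defined in <linux/bpf.h> */
enum { XDP_DROP = 1 };		/* same value as in <linux/bpf.h> */

/* The section name tells libbpf the program type: "xdp", not "prog". */
SEC("xdp")
int xdp_drop(struct xdp_md *ctx)
{
	(void)ctx;		/* unused: every packet is dropped */
	return XDP_DROP;
}

char _license[] SEC("license") = "GPL";
```

The only change relative to the legacy layout is the string passed to
SEC(); the program logic stays the same.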

>
> -Toke
>



