Re: bpftool gen object doesn't handle GCC built BPF ELF files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 30, 2022 at 2:52 PM James Hilliard
<james.hilliard1@xxxxxxxxx> wrote:
>
> On Wed, Aug 3, 2022 at 3:24 PM James Hilliard <james.hilliard1@xxxxxxxxx> wrote:
> >
> > On Wed, Aug 3, 2022 at 10:57 AM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 2, 2022 at 6:49 PM James Hilliard <james.hilliard1@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Aug 2, 2022 at 6:29 PM Andrii Nakryiko
> > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Aug 2, 2022 at 3:06 PM James Hilliard <james.hilliard1@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Tue, Aug 2, 2022 at 3:29 PM Andrii Nakryiko
> > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, Aug 1, 2022 at 4:35 PM James Hilliard <james.hilliard1@xxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Mon, Aug 1, 2022 at 4:52 PM Andrii Nakryiko
> > > > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > On Sun, Jul 31, 2022 at 7:20 PM James Hilliard
> > > > > > > > > <james.hilliard1@xxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > > > > On Sun, Jul 10, 2022 at 2:22 PM Jose E. Marchesi
> > > > > > > > > > <jose.marchesi@xxxxxxxxxx> wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >>> On Sun, Jul 10, 2022 at 3:38 AM Jose E. Marchesi
> > > > > > > > > > > >>> <jose.marchesi@xxxxxxxxxx> wrote:
> > > > > > > > > > > >>>>
> > > > > > > > > > > >>>>
> > > > > > > > > > > >>>> > On Sat, Jul 9, 2022 at 4:41 PM Jose E. Marchesi
> > > > > > > > > > > >>>> > <jose.marchesi@xxxxxxxxxx> wrote:
> > > > > > > > > > > >>>> >>
> > > > > > > > > > > >>>> >>
> > > > > > > > > > > >>>> >> > On Sat, Jul 9, 2022 at 2:32 PM James Hilliard <james.hilliard1@xxxxxxxxx> wrote:
> > > > > > > > > > > >>>> >> >>
> > > > > > > > > > > >>>> >> >> On Sat, Jul 9, 2022 at 2:21 PM Jose E. Marchesi
> > > > > > > > > > > >>>> >> >> <jose.marchesi@xxxxxxxxxx> wrote:
> > > > > > > > > > > >>>> >> >> >
> > > > > > > > > > > >>>> >> >> >
> > > > > > > > >
> > > > > > > > > Please trim your replies (and I don't know what your email client did,
> > > > > > > > > but it completely ruined nested quote formatting)
> > > > > > > >
> > > > > > > > Yeah, not sure what happened there.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > [...]
> > > > > > > > >
> > > > > > > > > > > >>>>
> > > > > > > > > > > >>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > > > > > > > > > >>>> index e89cc9c885b3..887b78780099 100644
> > > > > > > > > > > >>>> --- a/tools/lib/bpf/libbpf.c
> > > > > > > > > > > >>>> +++ b/tools/lib/bpf/libbpf.c
> > > > > > > > > > > >>>> @@ -1591,6 +1591,10 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
> > > > > > > > > > > >>>>         for (sec_idx = 1; sec_idx < obj->efile.sec_cnt; sec_idx++) {
> > > > > > > > > > > >>>>                 sec_desc = &obj->efile.secs[sec_idx];
> > > > > > > > > > > >>>>
> > > > > > > > > > > >>>> +                /* Skip recognized sections with size 0.  */
> > > > > > > > > > > >>>> +                if (sec_desc->data && sec_desc->data->d_size == 0)
> > > > > > > > > > > >>>> +                  continue;
> > > > > > > > > > > >>>> +
> > > > > > > > > > > >>>>                 switch (sec_desc->sec_type) {
> > > > > > > > > > > >>>>                 case SEC_DATA:
> > > > > > > > > > > >>>>                         sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx));
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> Ok, skeleton is now getting generated successfully, however it differs from the
> > > > > > > > > > > >>> clang version so there's a build error when we include/use the header:
> > > > > > > > > > > >>> ../src/core/restrict-ifaces.c: In function ‘prepare_restrict_ifaces_bpf’:
> > > > > > > > > > > >>> ../src/core/restrict-ifaces.c:45:14: error: ‘struct
> > > > > > > > > > > >>> restrict_ifaces_bpf’ has no member named ‘rodata’; did you mean
> > > > > > > > > > > >>> ‘data’?
> > > > > > > > > > > >>>    45 |         obj->rodata->is_allow_list = is_allow_list;
> > > > > > > > > > > >>>       |              ^~~~~~
> > > > > > > > > > > >>>       |              data
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> The issue appears to be that clang generates "rodata" members in
> > > > > > > > > > > >>> restrict_ifaces_bpf while with gcc we get "data" members instead.
> > > > > > > > > > > >>
> > > > > > > > > > > >> This is because the BPF GCC port is putting the
> > > > > > > > > > > >>
> > > > > > > > > > > >>   const volatile unsigned char is_allow_list = 0;
> > > > > > > > > > > >>
> > > > > > > > > > > >> in a .data section instead of .rodata, due to the `volatile'.  The
> > > > > > > > > > > >> x86_64 GCC seems to use .rodata.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Looking at why the PBF port does this...
> > > > > > > > > > > >
> > > > > > > > > > > > So, turns out GCC puts zero-initialized `const volatile' variables in
> > > > > > > > > > > > .data sections (and not .rodata) in all the targets I have tried, like
> > > > > > > > > > > > x86_64 and aarch64.
> > > > > > > > > > > >
> > > > > > > > > > > > So this is a LLVM and GCC divergence :/
> > > > > > > > > > >
> > > > > > > > > > > See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521.
> > > > > > > > > > >
> > > > > > > > > > > You may try, as a workaround:
> > > > > > > > > > >
> > > > > > > > > > > __attribute__((section(".rodata"))) const volatile unsigned char is_allow_list = 0;
> > > > > > > > > > >
> > > > > > > > > > > But that will use permissions "aw" for the .rodata section (and you will
> > > > > > > > > > > get a warning from the assembler.)  It may be problematic for libbpf.
> > > > > > > > > >
> > > > > > > > > > So rather than try to force gcc to use the incorrect llvm .rodata
> > > > > > > > > > section it looks
> > > > > > > > > > like we can instead just force llvm to use the correct .data section like this:
> > > > > > > > > > https://github.com/systemd/systemd/pull/24164
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > There is a huge difference between variables in .rodata and .data.
> > > > > > > > > .rodata variable's value is known to the BPF verifier at verification
> > > > > > > > > time and this knowledge will be used to decide which code paths are
> > > > > > > > > always or never taken (as one example). It's a crucial property and
> > > > > > > > > important guarantee.
> > > > > > > > >
> > > > > > > > > If you don't care about that property, don't declare the variable as `const`.
> > > > > > > > >
> > > > > > > > > So no, it's not llvm putting `const` variable into .rodata
> > > > > > > > > incorrectly, but GCC is trying to be smart and just because variable
> > > > > > > > > is declared volatile is putting *const* variable into read-write .data
> > > > > > > > > section. It's declared as const, and yes it's volatile to make sure
> > > > > > > > > that compiler isn't too smart about optimizing away read operations.
> > > > > > > >
> > > > > > > > Isn't const volatile generating a .rodata section(like llvm is doing) a spec
> > > > > > > > violation?
> > > > > > > > https://github.com/llvm/llvm-project/issues/56468
> > > > > > > >
> > > > > > > > > But it's still a const read-only variable from the perspective of that
> > > > > > > > > BPF C code.
> > > > > > > > >
> > > > > > > > > If you don't care about the read-only nature of that variable, drop
> > > > > > > > > the const and make it into a non-read-only variable.
> > > > > > > > >
> > > > > > > > > And please stop proposing hacks to be added to perfectly valid systemd
> > > > > > > > > BPF source code (I replied on [0] as well).
> > > > > > > >
> > > > > > > > From my understanding gcc is correctly putting a const volatile variable
> > > > > > > > in .data while llvm is incorrectly putting it in .rodata, is the gcc behavior
> > > > > > > > here invalid or is the llvm behavior invalid?
> > > > > > >
> > > > > > > From link you left to C standard, it does seem like that side-note
> > > > > > > *implies* that const volatile should be put into .data, but it's a)
> > > > > > > implied b) is quite arguable about assumptions that this data has to
> > > > > > > be in modifiable section, and c) entire CO-RE feature detection and
> > > > > > > guarding relies on having `const volatile` variables in .rodata and
> > > > > > > mark them as read-only for BPF verifier to allow dead code
> > > > > > > elimination. Changing c) would break entire CO-RE ecosystem.
> > > > > >
> > > > > > Well it does appear that llvm is fixing the behavior to be in line with gcc:
> > > > > > https://reviews.llvm.org/D131012
> > > > > >
> > > > > > >
> > > > > > > Seems like this issue was raised by Ulrich Drepper back in 2005 ([0])
> > > > > > > and he was also confused about GCC's behavior, btw.
> > > > > > >
> > > > > > > So either way, at the very least for BPF target we can't change this
> > > > > > > and I still think it's more logical to put const variables into
> > > > > > > .rodata, regardless of side notes in C standard.
> > > > > >
> > > > > > GCC does put const variables in .rodata, just not const volatile variables.
> > > > > >
> > > > >
> > > > > It's still const. Volatile doesn't change constness of a variable.
> > > > > We've been discussing this over and over, it gets a bit tiring, tbh.
> > > > >
> > > > > Aaron Ballman seems to agree, quoting him from [0]:
> > > > >
> > > > >   The footnote cited isn't a normative requirement (I can't see any
> > > > >   normative requirements that say we can't put a const volatile object
> > > > >   into a read only section), so it's debatable just how much of a bug
> > > > >   this is from a standards conformance perspective. I will have to
> > > > >   inquire on the WG14 reflectors to see if there's something I've
> > > > >   missed but I believe that C17 6.7.3p5 is meant to point out that
> > > > >   only lvalue are qualified; rvalues are not because lvalue conversion
> > > > >   strips the qualifiers. I think the footnote is mostly a nod towards
> > > > >   the fact that volatile qualified objects may change their value in
> > > > >   ways unknown to the compiler so storing it in a read-only section
> > > > >   of memory is a bit questionable. But I'm curious if the committee
> > > > >   tells me I've missed something there.
> > > > >
> > > > >
> > > > >   [0] https://github.com/llvm/llvm-project/issues/56468#issuecomment-1203146308
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > As for systemd's program and its is_allow_list ([1]), to unblock
> > > > > > > yourself you can drop const because systemd doesn't rely on read-only
> > > > > > > guarantees of that variable anyways. It's much more critical in
> > > > > > > feature-detection use cases. But let's try to converge discussion in
> > > > > > > one place (preferably here), it's quite inconvenient to either reply
> > > > > > > the same thing twice here and on Github, or cross-reference lore and
> > > > > > > Github.
> > > > > >
> > > > > > Hmm, are you sure:
> > > > > > const __u8 is_allow_list SEC(".rodata") = 0;
> > > > > >
> > > > > > doesn't provide equivalent behavior to:
> > > > > > const volatile __u8 is_allow_list = 0;
> > > > >
> > > > > Is there anything preventing you from experimenting with this yourself?..
> > > >
> > > > Wasn't sure how to reproduce the issue properly.
> > > >
> > >
> > > Then please start asking specific questions about how to repro issues
> > > we are talking about, instead of randomly throwing various ideas
> > > around and seeing what sticks.
> >
> > I tested with the reproducer to see if any attribute based workarounds
> > might avoid the need for volatile, and I didn't find anything.
> >
> > >
> > > > >
> > > > > Let's see:
> > > > >
> > > > > $ git diff
> > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c
> > > > > b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> > > > > index 99dac5292b41..386785a45af0 100644
> > > > > --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
> > > > > +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> > > > > @@ -31,6 +31,8 @@ void test_skeleton(void)
> > > > >         if (CHECK(skel->kconfig, "skel_kconfig", "kconfig is mmaped()!\n"))
> > > > >                 goto cleanup;
> > > > >
> > > > > +       skel->rodata->should_trap = false;
> > > > > +
> > > > >         bss = skel->bss;
> > > > >         data = skel->data;
> > > > >         data_dyn = skel->data_dyn;
> > > > > diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c
> > > > > b/tools/testing/selftests/bpf/progs/test_skeleton.c
> > > > > index 1a4e93f6d9df..eee26bc82525 100644
> > > > > --- a/tools/testing/selftests/bpf/progs/test_skeleton.c
> > > > > +++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
> > > > > @@ -26,6 +26,8 @@ const volatile struct {
> > > > >         const int in6;
> > > > >  } in = {};
> > > > >
> > > > > +const bool should_trap SEC(".rodata") = true;
> > > > > +
> > > > >  /* .data section */
> > > > >  int out1 = -1;
> > > > >  long long out2 = -1;
> > > > > @@ -58,6 +60,10 @@ int handler(const void *ctx)
> > > > >  {
> > > > >         int i;
> > > > >
> > > > > +       while (should_trap)
> > > > > +       {
> > > > > +       }
> > > > > +
> > > > >         out1 = in1;
> > > > >         out2 = in2;
> > > > >         out3 = in3;
> > > > >
> > > > >
> > > > > Result: selftests stop compiling. Why?
> > > > >
> > > > > $ llvm-objdump -d test_skeleton.linked1.o
> > > > >
> > > > > test_skeleton.linked1.o:        file format elf64-bpf
> > > > >
> > > > > Disassembly of section raw_tp/sys_enter:
> > > > >
> > > > > 0000000000000000 <handler>:
> > > > >        0:       05 00 ff ff 00 00 00 00 goto -1 <handler>
> > > > >
> > > > > Because compiler optimized away everything with while (true) {}. There
> > > > > is no global data at all anymore.
> > > > >
> > > > > This is what I keep telling you, const volatile is the only thing
> > > > > preventing the compiler from making wrong assumptions. And neither
> > > > >
> > > > > const bool should_trap SEC(".rodata") = true;
> > > > >
> > > > > nor
> > > > >
> > > > > const bool should_trap = true;
> > > > >
> > > > > work.
> > > > >
> > > > > But
> > > > >
> > > > > const volatile bool should_trap = true;
> > > > >
> > > > > does work.
> > > > >
> > > > > If you have some more speculative proposals, please investigate them
> > > > > yourself and report back with the results. But either way there is a
> > > > > lot of BPF code written with reliance on `const volatile`, both open
> > > > > source and not, so please stop actively trying to break BPF ecosystem
> > > > > with proposals like moving `const volatile` to .data, just because
> > > > > it's more convenient for you in GCC. Thank you.
> > > >
> > > > So it does at least seem I can force GCC to use .rodata like this for
> > > > systemd, maybe that's the best approach for now?:
> > > >
> > > > const volatile __u8 is_allow_list SEC(".rodata") = 0;
> > > >
> > > > It does emit an assembler warning but otherwise appears to work:
> > > > Generating src/core/bpf/restrict_ifaces/restrict-ifaces.bpf.unstripped.o
> > > > with a custom command
> > > > /tmp/ccM2b7jP.s: Assembler messages:
> > > > /tmp/ccM2b7jP.s:87: Warning: setting incorrect section attributes for .rodata
> > > >
> > > > I updated my systemd PR with that.
> > > >
> > >
> > > For Nth time, don't do this. Drop the const for systemd. And please
> > > start paying attention to feedback.
> >
> > Well I was looking for a more general solution for behavior convergence,
> > I posted here to see if that was a workable option to get both compilers
> > to behave the same for cases where we want the variable to be in
> > .rodata. I had also updated the pull request to make it easier to get
> > feedback on that approach.
>
> I've opened a pull request reverting this:
> https://github.com/systemd/systemd/pull/24881
>
> GCC const volatile .rodata section behavior should now match LLVM/clang.
>
> Details:
> https://github.com/gcc-mirror/gcc/commit/a0aafbc324aa90421f0ce99c6f5bbf64ed163da6
>

Great, thanks! Approved systemd PR (FWIW).

> >
> > I guess a developer at your company approved the PR already and
> > it got merged.
> >
> > >
> > > Please also start with learning how to build selftests/bpf with Clang
> > > and looking at various examples so that you can at least look at all
> > > the different features we rely on in the BPF world, instead of trying
> > > to tune one small piece (systemd's BPF programs) to your liking. If
> > > you are serious about making GCC BPF backend viable, you'll have to
> > > understand BPF a bit better.
> >
> > I did do that for both clang and gcc, I've been fixing issues I discovered
> > like this one for example when trying to build tests with gcc:
> > https://lore.kernel.org/bpf/20220803151403.793024-1-james.hilliard1@xxxxxxxxx/
> >
> > >
> > > That would be a better use of everyone's time, instead of you going
> > > behind our backs and requesting Clang to break the entire BPF
> > > ecosystem just because GCC is doing something differently, like you
> > > and Jose E. Marchesi did with [0] and [1].
> >
> > The behavior divergence here appears to predate BPF support in both
> > compilers from my understanding, the issue is being discussed in both
> > GCC and llvm issue trackers to try and figure out what the best way to
> > get behavior convergence would be, seems to make sense to me to
> > have this issue tracked by llvm and GCC.
> >
> > Explicitly setting the section seemed like it might be a good approach so
> > that in the event llvm did change the default things would still work the
> > same way.
> >
> > I don't think anyone is trying to go behind anyone's back here...I've tried
> > to cross reference issues in general so that everyone is aware of all the
> > discussions.
> >
> > >
> > >   [0] https://github.com/llvm/llvm-project/issues/56468
> > >   [1] https://reviews.llvm.org/D131012
> > >
> > > > >
> > > > > >
> > > > > > The used attribute in the SEC macro supposedly ensures that:
> > > > > > The compiler must emit the definition even if it appears to be unused, and
> > > > > > it must not apply optimizations which depend on fully understanding how
> > > > > > the entity is used.
> > > > > >
> > > > > > Or maybe the retain attribute along with used would be sufficient to allow
> > > > > > us to drop volatile in these cases?:
> > > > > > https://clang.llvm.org/docs/AttributeReference.html#retain
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >   [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
> > > > > > >   [1] https://github.com/systemd/systemd/pull/24164#issuecomment-1203207372
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >   [0] https://github.com/systemd/systemd/pull/24164#issuecomment-1201806413
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > [...]




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux