Re: libbpf/bpftool inconsistent handling of .data and .bss ?

On 10/7/20 3:26 PM, Andrii Nakryiko wrote:
On Wed, Oct 7, 2020 at 2:29 PM Luigi Rizzo <lrizzo@xxxxxxxxxx> wrote:

On Wed, Oct 7, 2020 at 10:40 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:

On Wed, Oct 7, 2020 at 1:31 PM Luigi Rizzo <lrizzo@xxxxxxxxxx> wrote:

TL;DR: there seems to be a compiler bug with clang-10 and -O2
when structs are in .data -- details below.

On Wed, Oct 7, 2020 at 8:35 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:

On Wed, Oct 7, 2020 at 9:03 AM Luigi Rizzo <rizzo@xxxxxxxxxxxx> wrote:

I am experiencing some weirdness in global variables handling
in bpftool and libbpf, as described below.
...
2. .bss overrides from userspace are not seen in bpf at runtime

     In foo_bpf.c I have "int x = 0;"
     In the userspace program, before foo_bpf__load(), I do
        obj->bss->x = 1
     but after attach, the bpf code does not see the change, ie
         "if (x == 0) { .. } else { .. }"
     always takes the first branch.

     If I initialize "int x = 2" and then do
        obj->data->x = 1
     the update is seen correctly ie
           "if (x == 2) { .. } else { .. }"
      takes one or the other depending on whether userspace overrides
      the value before foo_bpf__load()
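
A minimal sketch of the access pattern described above, in plain C so it can be compiled without libbpf. The structs are hypothetical stand-ins for what a generated skeleton header exposes (one struct per section, reached through obj->bss and obj->data); they are not the real foo_bpf.skel.h. The variable y is an assumed name for the non-zero-initialized case, so that both sections can appear side by side. The sketch only shows how the overrides are written and does not model what load does to the memory.

```c
#include <assert.h>
#include <stddef.h>

/* Mock per-section structs (hypothetical, not the generated header). */
struct foo_bpf__bss  { int x; };  /* "int x = 0;" is zero-initialized: .bss */
struct foo_bpf__data { int y; };  /* "int y = 2;" has an initializer: .data */

/* Mock skeleton object holding one pointer per section. */
struct foo_bpf {
    struct foo_bpf__bss  *bss;
    struct foo_bpf__data *data;
};

/* Write userspace overrides through the section pointers. */
void apply_overrides(struct foo_bpf *obj, int new_x, int new_y)
{
    assert(obj != NULL && obj->bss != NULL && obj->data != NULL);
    obj->bss->x = new_x;   /* corresponds to obj->bss->x = 1 in the report */
    obj->data->y = new_y;  /* corresponds to obj->data->x = 1 in the report */
}
```

In a real program the two pointers would be filled in by the skeleton's open step rather than by hand.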

This is quite surprising, given we have explicit selftests validating
that all this works. And it seems to work. Please check
prog_tests/skeleton.c and progs/test_skeleton.c. Can you try running
it and confirm that it works in your setup?

Ah, this was non-intuitive but obvious in hindsight:

.bss is zeroed by the kernel after load(), and since my program
changed the value before foo_bpf__load() , the memory was overwritten
with 0s. I could confirm this by printing the value after load.

If I update obj->data-><something> after __load(),
or even after __attach() given that userspace mmaps .bss and .data,
everything works as expected both for scalars and structs.

Check prog_tests/skeleton.c again: it sets .data, .bss, and .rodata
before the load, and checks that those values are preserved after
load. So if you set .bss manually before load, the load shouldn't
zero out what you set.

Don't know what to say: it is cleared on my laptop 5.7.17

I printed the values around assignments and calls
(also verified that obj->bss does not change):
Below, x is "uint32_t x = 0" in .bss and s is
"struct one { uint32_t a; } s = { .a = 2 }" in .data
Program output:

before load, obj->bss is 0x7fb0698b6000
initially x is 0 s.a is 2
// x = 1; s.a = 3
before load x is 1 s.a is 3
after load, obj->bss is 0x7fb0698b6000
after load x is 0 s.a is 3 // note x is cleared, s is left alone
// x = 2; s.a = 4;
after assign x is 2 s.a is 4 variables by 10 every 5ms
// attach, when the program runs (every 5ms) does
// if (s.a == 2 || s.a > 10) { x += 10; s.a += 10}
after attach x is 12 s.a is 12
at runtime count_off is 2382 x is 12
at runtime count_off is 2382 x is 12
...

Could it be some security setting ?




3. .data overrides do not seem to work for non-scalar types
     In foo_bpf.c I have
           struct one { int a; }; // type also visible to userspace
           struct one x = { .a = 2 }; // avoid bugs #1 and #2
     If in userspace I do
           obj->data->x.a = 1
     the update is not seen in the kernel, ie
             "if (x.a == 2) { .. } else { .. }"
      always takes the first branch


Similarly, the same skeleton selftest tests this situation. So please
check selftests first and report if selftests for some reason don't
work in your case.

Actually test_skeleton.c does _not_ test for struct in .data,
only in .rodata and .bss

It doesn't matter which section it's in, I meant it's testing struct
field accesses from at least one of global data sections.

Right but as the llvm-objdump shows, the compiler is treating
.bss and .data differently, at least for struct reads.



There seems to be a compiler error, at least with clang-10 and -O2

Note how in the struct case the compiler uses '2' as an immediate value
when reading, whereas in the scalar case it correctly dereferences
the pointer to the variable

It would be useful to include your original source code, especially
the variable declaration parts. I suspect that you declared your
struct variable as a static variable? In that case Clang will assume
nothing can change the value and can inline values like 2. So either
make sure you have a global variable declaration or use `static
volatile`. See how `const volatile` is used throughout all selftests
when working with the .rodata section.
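
A minimal sketch of the declaration style suggested above, written as plain C rather than a BPF program. The names struct_in_data, takes_first_branch and override_and_check are hypothetical. It shows only the language-level point: a non-static, volatile global has to be re-read from memory on every access, so a later write is visible to the branch. It says nothing about how a particular clang BPF backend actually compiles the read.

```c
#include <assert.h>
#include <stdint.h>

struct one { uint32_t a; };

/* Non-static and volatile: the compiler may not assume the value stays 2. */
volatile struct one struct_in_data = { .a = 2 };

/* Returns 1 when the "x.a == 2" branch would be taken, 0 otherwise. */
int takes_first_branch(void)
{
    return struct_in_data.a == 2;
}

/* Stand-in for a userspace override, followed by a check that the
 * branch decision tracks the newly written value. */
void override_and_check(uint32_t v)
{
    struct_in_data.a = v;
    assert(takes_first_branch() == (v == 2));
}
```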

Perhaps the easiest is to see it on godbolt:

https://godbolt.org/z/Mnx38v


Thanks for the example. I can also reproduce this locally. It does
seem like a Clang/LLVM bug at this point. The generated code makes
absolutely no sense to me:

r1 = 100
if r1 > 3 goto +5
r1 = 3
r1 += 111

Something fishy is going on there. I bet Yonghong will quickly figure
out what's going on.

Thanks a lot for the test! This exposed a serious bug in the llvm backend
load optimization. The original opt pass was implemented in 2017
for local variables with initializers. Now *more* uses of global
variables have exposed additional bugs. I have posted a fix at
   https://reviews.llvm.org/D89021


BTW, I tried `static volatile` for the variable, marking field a as
volatile, and marking the variable as __attribute__((weak)). Nothing
really helps: the generated code is still weird and inlines constants.

and how clang gets terribly confused when compiling read access
to the struct_in_data field

cheers
luigi


