Re: CO-RE builtins purity and other compiler optimizations

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 5 Jul 2023 17:02:42 -0700

On Wed, Jul 5, 2023 at 11:07 AM Jose E. Marchesi
<jose.marchesi@xxxxxxxxxx> wrote:
>
>
> Hello BPF people!
>
> We are still working on supporting the pending CO-RE built-ins in GCC.
> The trick of hooking in the parser to avoid constant folding, as
> discussed during LSFMMBPF, seems to work well.  Almost there!
>
> So, most of the CO-RE associated C built-ins have the side effect of
> emitting a CO-RE relocation in the .BTF.ext section.  This is for example
> the case of __builtin_preserve_enum_value.
>
> Like calls to regular functions, calls to C built-ins are also
> candidates to certain optimizations.  For example, given this code:
>
> : int a = __builtin_preserve_enum_value(*(typeof(enum E) *)eB, BPF_ENUMVAL_VALUE);
> : int b = __builtin_preserve_enum_value(*(typeof(enum E) *)eB, BPF_ENUMVAL_VALUE);
>
> The compiler may very well decide to optimize out the second call to the
> built-in if it is to be considered "pure", i.e. given exactly the same
> arguments it produces the same results.
>
> We observed that clang indeed seems to optimize that way.  See
> https://godbolt.org/z/zqe9Kfrrj.
>
> That kind of optimization has an impact on the number of CO-RE
> relocations emitted.
>
> Question:
>
> Is the BPF loader, the BPF verifier or any other core component sensitive
> in any way to the number (and ordering) of CO-RE relocations for some
> given BPF C program?  i.e. compiling the same BPF C program above with
> and without that optimization, will it work in both cases?

Yes, it should work in both cases.

>
> If no, then perfect!  Different compilers can optimize slightly

Did you mean "if yes, then perfect"? Because otherwise it makes no sense :)

> differently (or not optimize at all) and we can mark these built-ins as
> pure in GCC as well, benefiting from optimizations without worrying about
> having to emit exactly what clang emits.

Yes, it should be fine, as long as the compiler doesn't assume any
specific value returned by CO-RE relocation (and doesn't perform any
optimizations based on that assumed value). From the BPF verifier
side, it's just a constant, so the BPF verifier itself doesn't care.
From the libbpf/BPF loader standpoint, all that matters is that there
is CO-RE relocation information that specifies how some BPF
instruction needs to be adjusted to match the host kernel properly.
Whether CO-RE relocation is repeated many times, or performed just
once and that constant value is just reused in the code many times,
shouldn't matter at all.
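
To make that concrete, here is a toy model in plain C. It is only a
sketch and not libbpf code: toy_insn, toy_relo and toy_apply_relos are
hypothetical names made up for illustration, and the "program" is just
an array of immediates. It assumes the loader's job reduces to writing
the host kernel's value into every instruction that a relocation record
points at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a BPF instruction: only the immediate matters here. */
struct toy_insn {
    int32_t imm;
};

/* Toy stand-in for a CO-RE relocation record: "patch this instruction". */
struct toy_relo {
    size_t insn_idx;
};

/* Conceptual loader step (an assumption of this sketch): write the host
 * value into every instruction referenced by a relocation record. */
static void toy_apply_relos(struct toy_insn *prog, size_t prog_len,
                            const struct toy_relo *relos, size_t nr_relos,
                            int32_t host_value)
{
    for (size_t i = 0; i < nr_relos; i++) {
        assert(relos[i].insn_idx < prog_len);
        prog[relos[i].insn_idx].imm = host_value;
    }
}

/* Built-in not treated as pure: two loads, two relocations. */
static int32_t toy_sum_unoptimized(int32_t host_value)
{
    struct toy_insn prog[2] = { { 0 }, { 0 } };
    const struct toy_relo relos[2] = { { 0 }, { 1 } };

    toy_apply_relos(prog, 2, relos, 2, host_value);
    int32_t a = prog[0].imm;
    int32_t b = prog[1].imm;
    return a + b;
}

/* Built-in treated as pure: one load, one relocation, value reused. */
static int32_t toy_sum_optimized(int32_t host_value)
{
    struct toy_insn prog[1] = { { 0 } };
    const struct toy_relo relos[1] = { { 0 } };

    toy_apply_relos(prog, 1, relos, 1, host_value);
    int32_t a = prog[0].imm;
    int32_t b = a; /* second call folded into a reuse of the first */
    return a + b;
}
```

Both variants compute the same result for any host value, because each
patched instruction ends up holding the same constant. The model stops
being valid only if the compiler bakes in an assumed value at compile
time, which is the caveat mentioned above.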

>
> If yes, wouldn't it be better to disable that kind of optimization in
> all C BPF compilers, i.e. to make the compilers aware of the side-effect
> so they will not optimize built-in calls out (or replicate them) and to
> make this mandatory in the CO-RE spec?  Making a compiler optimize
> exactly like another compiler is difficult and sometimes even not
> feasible.
>
> Thanks in advance for the clarification/info!
>