Re: CO-RE builtins purity and other compiler optimizations

"Jose E. Marchesi" <jose.marchesi@xxxxxxxxxx> · Thu, 06 Jul 2023 11:03:59 +0200

> On Wed, Jul 5, 2023 at 11:07 AM Jose E. Marchesi
> <jose.marchesi@xxxxxxxxxx> wrote:
>>
>>
>> Hello BPF people!
>>
>> We are still working on supporting the pending CO-RE built-ins in GCC.
>> The trick of hooking in the parser to avoid constant folding, as
>> discussed during LSFMMBPF, seems to work well.  Almost there!
>>
>> So, most of the CO-RE associated C built-ins have the side effect of
>> emitting a CO-RE relocation in the .BTF.ext section.  This is for example
>> the case of __builtin_preserve_enum_value.
>>
>> Like calls to regular functions, calls to C built-ins are also
>> candidates for certain optimizations.  For example, given this code:
>>
>> : int a = __builtin_preserve_enum_value(*(typeof(enum E) *)eB, BPF_ENUMVAL_VALUE);
>> : int b = __builtin_preserve_enum_value(*(typeof(enum E) *)eB, BPF_ENUMVAL_VALUE);
>>
>> The compiler may very well decide to optimize out the second call to the
>> built-in if it is to be considered "pure", i.e. given exactly the same
>> arguments it produces the same results.
>>
>> We observed that clang indeed seems to optimize that way.  See
>> https://godbolt.org/z/zqe9Kfrrj.
>>
>> That kind of optimization has an impact on the number of CO-RE
>> relocations emitted.
>>
>> Question:
>>
>> Is the BPF loader, the BPF verifier or any other core component sensitive
>> in any way to the number (and ordering) of CO-RE relocations for some
>> given BPF C program?  i.e. compiling the same BPF C program above with
>> and without that optimization, will it work in both cases?
>
> Yes, it should.
>
>>
>> If no, then perfect!  Different compilers can optimize slightly
>
> Did you mean "if yes, then perfect"? Because otherwise it makes no sense :)

Yeah, I was referring to the first question, not the second :)

>> differently (or not optimize at all) and we can mark these built-ins as
>> pure in GCC as well, benefiting from optimizations without having to
>> worry about emitting exactly what clang emits.
>
> Yes, it should be fine, as long as the compiler doesn't assume any
> specific value returned by CO-RE relocation (and doesn't perform any
> optimizations based on that assumed value). From the BPF verifier
> side, it's just a constant, so the BPF verifier itself doesn't care.
> From the libbpf/BPF loader standpoint, all that matters is that there
> is CO-RE relocation information that specifies how some BPF
> instruction needs to be adjusted to match the host kernel properly.
> Whether CO-RE relocation is repeated many times, or performed just
> once and that constant value is just reused in the code many times,
> shouldn't matter at all.

Ok, this is good.  Thanks for confirming!

>>
>> If yes, wouldn't it be better to disable that kind of optimization in
>> all C BPF compilers, i.e. to make the compilers aware of the side-effect
>> so they will not optimize built-in calls out (or replicate them), and to
>> make this mandatory in the CO-RE spec?  Making a compiler optimize
>> exactly like another compiler is difficult and sometimes not even
>> feasible.
>>
>> Thanks in advance for the clarification/info!
>>