On Thu, Sep 12, 2019 at 09:26:57PM +0200, Jose E. Marchesi wrote: > > Hi people! > > First of all, many thanks for the lots of feedback I got at the LPC > conference. It was very useful and made these days totally worth it :) > > In order to advance in the direction of having a single bpf_helpers.h > header that works with both llvm and gcc, I would like to suggest a few > changes for the kernel's header. > > Kernel helpers > -------------- > > First, there is the issue of kernel helpers. You people made it very > clear at LPC that having a compiler built-in function per kernel helper > is way too restrictive, since it makes it impossible to use new kernel > helpers without patching the compiler. I agree. > > However, I still think that the function pointer hack currently used in > bpf_helpers.h is way too fragile, depending on the optimization level > and the particular behavior of the compiler. > > Thinking about a more robust and flexible solution, today I wrote a > patch for GCC that adds a target-specific function attribute: > > __attribute__ ((kernel_helper (NUM))) > > Then I changed my bpf-helpers.h to define the kernel helpers like: > > void *bpf_map_lookup_elem (void *map, const void *key) > __attribute__ ((kernel_helper (1))); > > This new mechanism allows the user to mark any function prototype as a > kernel helper, so the flexibility is total. It also allowed me to get > rid of the table of helpers in the GCC backend proper, which is awesome > :) > > Would you consider implementing this attribute in llvm and adapt the > kernel's bpf_header accordingly? In that respect, note that it is > possible to pass enum entries to the attribute (at least in GCC.) So > you once you implement the attribute, you should be able to do: > > void *bpf_map_lookup_elem (void *map, const void *key) > __attribute__ ((kernel_helper (BPF_FUNC_map_lookup_elem))); > > instead of the current: > > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = > (void *) BPF_FUNC_map_lookup_elem; What we've been using for long time is not exactly normal C code, but it's a valid C code that any compiler and any backend should consume and generate the code prescribed by the language. Here it says that it's a function pointer with fixed offset. x86 backends in both clang and gcc do the right thing. I don't understand why it's causing bpf backend for gcc to stumble. You mentioned "helper table in gcc bpf backend". That sounds like a red flag. The backend should not know names and numbers for helpers. For the following: static void (*foo)(void) = (void *) 123; int bar() { foo(); } It should generate bpf instruction 'bpf_call 123', since that's what C language is asking compiler to do. It is as you pointed out 'fragile', since it won't work with -O0, but that sort of the point. -O0 is too debuggy and un-optimized that even without this call insn quirk the verifier is not able to analyze even simple programs. Hence -O2 was a requirement for bpf development due to verifier smartness. One can argue that the verifier should become even smarter and analyze -O0 code, but I would argue otherwise. Linux kernel itself won't work with -O0. Same reasoning applies to bpf code. The main purpose of -O0 for user space development is to produce code together with -g that debugger can understand. The variables will stay on stack, line numbers will be intact, etc For bpf program development that's anti pattern. The 'bpf debugging' topic at plumbers showed that we still has a long way to go to make bpf debugging better. Single step, execution trace, nested bpf, etc All that will come, but -O0 support will not. > Please let me know what do you think. > > SKB load built-ins > ------------------ > > bpf_helpers.h contains the following llvm-isms: > > /* llvm builtin functions that eBPF C program may use to > * emit BPF_LD_ABS and BPF_LD_IND instructions > */ > struct sk_buff; > unsigned long long load_byte(void *skb, > unsigned long long off) asm("llvm.bpf.load.byte"); > unsigned long long load_half(void *skb, > unsigned long long off) asm("llvm.bpf.load.half"); > unsigned long long load_word(void *skb, > unsigned long long off) asm("llvm.bpf.load.word"); > > Would you consider adopting more standard built-ins in llvm, like I > implemented in GCC? These are: > > __builtin_bpf_load_byte (unsigned long long off) > __builtin_bpf_load_half (unsigned long long off) > __builtin_bpf_load_word (unsigned long long off) > > Note that I didn't add an SKB argument to the builtins, as it is not > used: the pointer to the skb is implied by the instructions to be in > some predefined register. I added compatibility wrappers in my > bpf-helpers.h: > > #define load_byte(SKB,OFF) __builtin_bpf_load_byte ((OFF)) > #define load_half(SKB,OFF) __builtin_bpf_load_half ((OFF)) > #define load_word(SKB,OFF) __builtin_bpf_load_word ((OFF)) > > Would you consider removing the unused SKB arguments from the built-ins > in llvm? Or is there a good reason for having them, other than maybe > backwards compatibility? In case backwards compatibility is a must, I > can add the unused argument to my builtins. llvm.bpf.load.word consumes 'skb' pointer. llvm bpf backend makes sure that it's in R6 before LD_ABS insn. __builtin_bpf_load_byte (unsigned long long off) cannot produce correct code. As far as doing it as __builtin_bpf_load_word(skb, off) instead of asm("llvm.bpf.load.word") that's fine, of course. Here compilers don't have to be the same. #ifdef clang vs gcc in bpf_helpers.h should do it. Also please get rid of bpf-helpers.h from gcc tree. There shouldn't be such things shipped with compiler.