BPF GCC status - Nov 2023

[During LPC 2023 we talked about improving communication between the GCC
 BPF toolchain port and the kernel side.  This is the first of the
 periodic reports that we plan to publish on the GCC wiki and send to
 interested parties.  Hopefully this will help.]

GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
IRC channel: #gccbpf at irc.oftc.net.
Help on using the port: gcc@xxxxxxxxxxx
Patches and/or development discussions: gcc-patches@xxxxxxx

Assembler
=========

- The BPF assembler was sometimes generating spurious symbols. The
  problem was that supporting the pseudo-C assembly syntax for BPF makes
  it impossible to use the traditional technique of hashing on
  mnemonics.  Instead, we are forced to attempt parsing entries in our
  opcodes table until some instruction template matches.  In some cases
  this was causing the parser to incorrectly parse part of an
  instruction opcode as an expression, which led to the creation of a
  new undefined symbol.

  David Faust installed a fix for this upstream:
  https://sourceware.org/pipermail/binutils/2023-November/130668.html

- gas: change meaning of ; in the BPF assembler.

  The clang assembler interprets semicolons as a statement/directive
  separator.  In the GNU BPF assembler that character was being
  interpreted as the beginning of a line comment, as is usual in
  assembly languages.  We detected this discrepancy with snippets like:

	asm volatile ("					\
	if r1 >= 0 goto l0_%=;				\
	r0 = 1;						\
	r0 += 2;					\
l0_%=:	exit;						\
"	::: __clobber_all);

  We installed a patch upstream that makes GAS behave like the clang
  assembler when interpreting semicolons in assembly programs:
  Jose E. Marchesi
  https://sourceware.org/pipermail/binutils/2023-November/130867.html

  The simulator tests have been updated accordingly:
  Jose E. Marchesi
  https://sourceware.org/pipermail/gdb-patches/2023-November/204581.html

- In the pseudo-C syntax, register names are not preceded by % characters
  or any other prefix.  As a consequence, in contexts like instruction
  operands, where both register names and expressions involving symbols
  are expected, there is no way to disambiguate between them.  GAS was
  allowing symbols like `w3' or `r5' in syntactic contexts where no
  registers were expected, such as in:

    r0 = w3 ll  ; GAS interpreted w3 as symbol, clang emits error

  The clang assembler did not allow that.  During LPC we agreed that
  the simplest approach is to not allow any symbol to have the same
  name as a register, in any context.  So we changed GAS so that it no
  longer accepts register names as symbols in any expression, such as:

    r0 = w3 + 1 ll  ; This now fails for both GAS and llvm.
    r0 = 1 + w3 ll  ; NOTE this does not fail with llvm, but it should.

  We installed a patch in GAS for this.
  Jose E. Marchesi
  https://sourceware.org/pipermail/binutils/2023-November/130684.html
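
  The rule above can be sketched as a small predicate.  This is only an
  illustration and not the GAS implementation: the function name is
  hypothetical, and it assumes that the register names are r0..r10 and
  w0..w10.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from GAS: return true if NAME spells
   a BPF register in pseudo-C syntax.  Assumes registers r0..r10
   (64-bit) and w0..w10 (32-bit views).  */
static bool
is_bpf_register_name (const char *name)
{
  char *end;
  long n;

  if (name[0] != 'r' && name[0] != 'w')
    return false;
  /* Require a digit right after the prefix.  */
  if (name[1] < '0' || name[1] > '9')
    return false;
  /* Reject leading zeros such as "r01".  */
  if (name[1] == '0' && name[2] != '\0')
    return false;
  n = strtol (name + 1, &end, 10);
  return *end == '\0' && n >= 0 && n <= 10;
}
```

  With a check of this kind, an assembler would refuse `w3' or `r5' as
  symbol names regardless of where they appear in an expression.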

- Cupertino Miranda fixed a GAS bug in the parsing of registers in
  pseudo-C syntax mode:
  https://sourceware.org/pipermail/binutils/2023-November/130732.html

Compiler
========

 - Remove bpf-helpers.h.

   Now that we are finally able to use the kernel provided bpf_helpers.h
   file and associated machinery, there is no longer need to distribute
   our own version.

   Jose E. Marchesi
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638226.html

 - Restore BPF build, always_inline in libgcc
   Jose E. Marchesi
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637948.html

 - Fix expected regexp in gcc.target/bpf/ldxdw.c test
   Jose E. Marchesi
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635892.html

 - Fix pseudo-c asm emitted for *mulsidi3_zeroextend
   Jose E. Marchesi
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635896.html

 - Corrected condition in core_mark_as_access index.
   Cupertino Miranda
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636389.html

 - Delayed the removal of the parser enum plugin handler.
   Cupertino Miranda
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636388.html

 - Force inlining __builtin_memcmp up to data sizes of 1024 bytes.
   Cupertino Miranda
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636390.html

 - Emit errors for libcalls and builtin-generated libcalls, like clang
   does.
   Cupertino Miranda
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638117.html

 - GCC was emitting external declarations for function calls
   corresponding to code that was attempted but eventually discarded.
   This happened, for example, when GCC tried a particular code
   sequence and then discarded it in favor of a better-performing
   alternative.  This was a problem with the BPF instruction set <= v3,
   because of the lack of signed division.  This is now fixed upstream.
   Jose E. Marchesi
   BZ 109253
   https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638143.html

 - Indu Bhagat is investigating a BTF generation problem that is
   resulting in non-anonymous FUNC_PROTO entries, which are not allowed
   in BTF and rejected by the BPF loader.  This apparently happens when
   functions get inlined.

Pending Patches for bpf-next
============================

These are the current patches we still have to submit to bpf@vger for
bpf-next.  We are in the process of testing them:

- bpf: add more options for gcc-bpf to selftests/bpf/Makefile

  This patch passes the following extra options to BPF_GCC in
  GCC_BPF_BUILD_RULE:

  -masm=pseudoc
  -mco-re
  -Wno-unknown-pragmas
  -Wno-unused-variable
  -Wno-error=attributes
  -Wno-error=address-of-packed-member
  -Wno-compare-distinct-pointer-types
  -fno-strict-aliasing

  Most of them either silence certain warnings or keep them from being
  treated as errors.  Code like:

    #define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))

  where `expr' is something like a pointer to a bpf_insn, requires
  disabling strict aliasing, which is activated by default with -O2 in
  GCC.
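
  The aliasing problem can be illustrated on any target.  The sketch
  below is not part of the patch: `struct insn' is a hypothetical 8-byte
  stand-in for a bpf_insn, and `insn_bits' is an illustrative name.  It
  reads the same 64 bits that the cast in the macro reads, but by
  copying bytes, which the C aliasing rules always permit.  It is not a
  drop-in replacement for the macro, since an "i" constraint needs a
  constant expression.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte stand-in for the kernel's struct bpf_insn.  */
struct insn
{
  uint8_t code;
  uint8_t regs;
  int16_t off;
  int32_t imm;
};

/* Read the raw 64 bits of an instruction without violating the C
   aliasing rules: copy the bytes instead of casting the pointer.  */
static uint64_t
insn_bits (const struct insn *i)
{
  uint64_t bits;

  memcpy (&bits, i, sizeof bits);
  return bits;
}
```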

- bpf: use r constraint instead of p constraint

  This was discussed in bpf@vger and it was decided that we would stop
  using the "p" constraint in the BPF kernel selftests.  That constraint
  is not really meant to be used externally to the compiler.

  https://lore.kernel.org/bpf/87edkbnq14.fsf@xxxxxxxxxx/

- bpf_core_read.h: GCC specific macro for preserve_enum_value

  This patch adds a version of the bpf_core_enum_value macro to be used
  by GCC.  The implementations of CO-RE built-ins in clang and GCC
  require different "magical expressions" to be passed to the built-ins.
  These macros hide the complexity from the user.

- bpf: avoid VLAs in progs/test_xdp_dynptr.c

  In progs/test_xdp_dynptr.c there are several VLAs in the handle_ipv4
  and handle_ipv6 functions:

    const size_t tcphdr_sz = sizeof(struct tcphdr);
    const size_t udphdr_sz = sizeof(struct udphdr);
    const size_t ethhdr_sz = sizeof(struct ethhdr);
    const size_t iphdr_sz = sizeof(struct iphdr);
    const size_t ipv6hdr_sz = sizeof(struct ipv6hdr);
    
    [...]
    
    static __always_inline int handle_ipv4(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
    {
        __u8 eth_buffer[ethhdr_sz + iphdr_sz + ethhdr_sz];
        __u8 iph_buffer_tcp[iphdr_sz + tcphdr_sz];
        __u8 iph_buffer_udp[iphdr_sz + udphdr_sz];
        [...]
    }

    static __always_inline int handle_ipv6(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
    {
        __u8 eth_buffer[ethhdr_sz + ipv6hdr_sz + ethhdr_sz];
        __u8 ip6h_buffer_tcp[ipv6hdr_sz + tcphdr_sz];
        __u8 ip6h_buffer_udp[ipv6hdr_sz + udphdr_sz];
        [...]
    }

  Neither GCC nor clang allows dynamic stack allocation (GCC used to
  support it, using one register as an auxiliary stack pointer, but it
  no longer does).

  The above code builds with clang but not with GCC:

    progs/test_xdp_dynptr.c:79:14: error: BPF does not support dynamic stack allocation
       79 |         __u8 eth_buffer[ethhdr_sz + iphdr_sz + ethhdr_sz];
          |              ^~~~~~~~~~

  We are guessing that clang turns these arrays from VLAs into normal
  statically sized arrays because ethhdr_sz and friends are constant and
  set to sizeof, which is always known at compile time.  This patch
  changes the selftest to use preprocessor constants instead of
  variables:

    #define tcphdr_sz sizeof(struct tcphdr)
    #define udphdr_sz sizeof(struct udphdr)
    #define ethhdr_sz sizeof(struct ethhdr)
    #define iphdr_sz sizeof(struct iphdr)
    #define ipv6hdr_sz sizeof(struct ipv6hdr)
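
  The distinction matters because in C a const-qualified variable is
  not an integer constant expression, while a sizeof applied to a fixed
  type is.  The following is a minimal sketch of the macro approach
  using hypothetical stand-in structures (`eth_hdr' and `ip_hdr' are
  not the kernel types):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real header structures.  */
struct eth_hdr { unsigned char bytes[14]; };
struct ip_hdr  { unsigned char bytes[20]; };

/* sizeof yields an integer constant expression here, so these macros
   can be used as array bounds without creating a VLA.  */
#define eth_hdr_sz sizeof (struct eth_hdr)
#define ip_hdr_sz  sizeof (struct ip_hdr)

static size_t
buffer_size (void)
{
  unsigned char buffer[eth_hdr_sz + ip_hdr_sz + eth_hdr_sz];

  /* This only compiles because `buffer' has a constant size: the
     sizeof of a VLA is not a constant expression.  */
  static_assert (sizeof buffer == 48, "buffer is a fixed-size array");
  return sizeof buffer;
}
```

  If the two macros were replaced by `const size_t' variables, the
  static_assert would be rejected, since `buffer' would then be a VLA.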

- bpf_helpers.h: define bpf_tail_call_static when building with GCC

- bpf: fix constraint in test_tcpbpf_kern.c

  GCC emits a warning, which -Werror turns into an error:

    progs/test_tcpbpf_kern.c:60:9: error: ‘op’ is used uninitialized [-Werror=uninitialized]

  when the uninitialized automatic `op' is used with a "+r" constraint
  in:

	asm volatile (
		"%[op] = *(u32 *)(%[skops] +96)"
		: [op] "+r"(op)
		: [skops] "r"(skops)
		:);

  The constraint should be "=r" instead.
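
  The difference between the two constraints can be shown with a
  target-independent sketch.  This is not the selftest code: the asm
  template is empty and the function name is illustrative.  A "+r"
  operand is both read and written by the asm, so its variable must
  hold a value beforehand, whereas a "=r" operand is only written.

```c
/* Illustrative only: the asm template is empty, so this builds for any
   GCC or clang target.  "=r" declares `out' as write-only, so it may
   be left uninitialized.  The matching constraint "0" asks the compiler
   to place `in' in the same register as operand 0, which is how `out'
   receives its value here.  */
static unsigned int
copy_through_asm (unsigned int in)
{
  unsigned int out;   /* deliberately not initialized */

  __asm__ volatile ("" : "=r" (out) : "0" (in));
  return out;
}
```

  Changing "=r" to "+r" in this sketch would make the asm read `out'
  before anything has been stored in it.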


Open Questions
==============

- BPF programs including libc headers.

  BPF programs run on their own without an operating system or a C
  library.  Implementing C implies providing certain definitions and
  headers, such as stdint.h and stdarg.h.  For such targets, known as
  "bare metal targets", the compiler has to provide these definitions
  and headers in order to implement the language.

  GCC provides the following C headers for BPF targets:

    float.h
    gcov.h
    iso646.h
    limits.h
    stdalign.h
    stdarg.h
    stdatomic.h
    stdbool.h
    stdckdint.h
    stddef.h
    stdfix.h
    stdint.h
    stdnoreturn.h
    syslimits.h
    tgmath.h
    unwind.h
    varargs.h

  However, we have found that at least one BPF kernel selftest
  includes glibc headers that, indirectly, include glibc's own
  definitions of stdint.h and friends.  This leads to compile-time
  errors due to conflicting types.  We think that including headers from
  a glibc built for some host target is very questionable.  For example,
  in BPF a C `char' is defined to be signed.  But if a BPF program
  includes glibc headers on an Android system, that code will assume an
  unsigned char instead.
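
  The signedness of plain `char' is implementation-defined, so it can
  be checked at build time with limits.h, which is one of the headers
  listed above.  A minimal sketch (the function name is illustrative):

```c
#include <limits.h>
#include <stdbool.h>

/* True when plain `char' is signed on the target being compiled for.  */
static bool
char_is_signed (void)
{
  return CHAR_MIN < 0;
}
```

  Headers written for a target where this returns false can silently
  disagree with BPF code, where `char' is signed.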

Other Updates
=============

- Brian Witte has adapted the Waldo 80211 debug/test/trace wireless
  analyzer tool to be built with GCC BPF.  This includes CI that uses
  the latest GCC git version, which is quite useful for us.

  https://git.sr.ht/~brianwitte/waldo-80211

- Brian has also published a very simple, tested and documented BPF
  program, with the goal of providing an accessible and instructive
  example for those interested in BPF development with the GNU
  toolchain.

  https://git.sr.ht/~brianwitte/gcc-bpf-example




