[RFC bpf-next 00/12] Use uapi kernel headers with vmlinux.h

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi BPF community,

AFAIK there is a long standing feature request to use kernel headers
alongside `vmlinux.h` generated by `bpftool`. For example significant
effort was put to add an attribute `bpf_dominating_decl` (see [1]) to
clang, unfortunately this effort was stuck due to concerns regarding C
language semantics.

After some discussion with Alexei and Yonghong I'd like to request
your comments regarding a somewhat brittle and partial solution to
this issue that relies on adding `#ifndef FOO_H ... #endif` guards in
the generated `vmlinux.h`.

The basic idea
---

The goal of the patch set is to allow usage of header files from
`include/uapi` alongside `vmlinux.h` as follows:

  #include <uapi/linux/tcp.h>
  #include "vmlinux.h"

This goal is achieved by adding `#ifndef ... #endif` guards in
`vmlinux.h` around definitions that originate from the `include/uapi`
headers. The guards emitted match the guards used in the original
headers. E.g. as follows:

include/uapi/linux/tcp.h:

  #ifndef _UAPI_LINUX_TCP_H
  #define _UAPI_LINUX_TCP_H
  ...
  union tcp_word_hdr {
	struct tcphdr hdr;
	__be32        words[5];
  };
  ...
  #endif /* _UAPI_LINUX_TCP_H */

vmlinux.h:

  ...
  #ifndef _UAPI_LINUX_TCP_H

  union tcp_word_hdr {
	struct tcphdr hdr;
	__be32 words[5];
  };

  #endif /* _UAPI_LINUX_TCP_H */
  ...

To get to this state the following steps are necessary:
- "header guard" name should be identified for each header file;
- the correspondence between data type and it's header guard has to be
  encoded in BTF;
- `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif`
  brackets.

It is not possible to identify header guard names for all uapi headers
basing only on the file name. However a simple script could devised to
identify the guards basing on the file name and it's content. Thus it
is possible to obtain the list of header names with corresponding
header guards.

The correspondence between type and it's declaration file (header) is
available in DWARF as `DW_AT_decl_file` attribute. The
`DW_AT_decl_file` can be matched with the list of header guards
described above to obtain the header guard name for a specific type.

The `pahole` generates BTF using DWARF. It is possible to modify
`pahole` to accept the header guards list as an additional parameter
and to encode the header guard names in BTF.

Implementation details
---

Present patch-set implements these ideas as follows:
- A parameter `--header_guards_db` is added to `pahole`. If present it
  points to a file with a list of `<header> <guard>` records.
- `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header
  guard for each type emitted to BTF. If header guard is present it is
  encoded alongside the type.
- Header guards are encoded in BTF as `BTF_DECL_TAG` records with a
  special prefix. The prefix "header_guard:" is added to a value of
  such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format
  changes).
- A special script `infer_header_guards.pl` is added as a part of
  kbuild, it can infer header guard names for each UAPI header basing
  on the header content.
- This script is invoked from `link-vmlinux.sh` prior to BTF
  generation during kernel build. The output of the script is saved to
  a file, the file is passed to `pahole` as `--header_guards_db`
  parameter.
- `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each
  type and to emit `#ifndef FOO_H ... #endif` brackets when
  "header_guard:" tag is present for a type.

Details for each patch in a set:
- libbpf: Deduplicate unambigous standalone forward declarations
- selftests/bpf: Tests for standalone forward BTF declarations deduplication

  There is a small number (63 for defconfig) of forward declarations
  that are not de-duplicated with the main type declaration under
  certain conditions. This hinders the header guard brackets
  generation. This patch addresses this de-duplication issue.

- libbpf: Support for BTF_DECL_TAG dump in C format
- selftests/bpf: Tests for BTF_DECL_TAG dump in C format

  Currently libbpf does not process BTF_DECL_TAG when btf is dumped in
  C format. This patch adds a hash table matching btf type ids with a
  list of decl tags to the struct btf_dump.
  The `btf_dump_emit_decl_tags` is not necessary for the overall
  patch-set to function but simplifies testing a bit.

- libbpf: Header guards for selected data structures in vmlinux.h
- selftests/bpf: Tests for header guards printing in BTF dump

  Adds option `emit_header_guards` to `struct btf_dump_opts`.
  When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif`
  brackets around types for which header guard information is present
  in BTF.

- bpftool: Enable header guards generation

  Unconditionally enables `emit_header_guards` for BTF dump in C format.

- kbuild: Script to infer header guard values for uapi headers
- kbuild: Header guards for types from include/uapi/*.h in kernel BTF

  Adds `scripts/infer_header_guards.pl` and integrates it with
  `link-vmlinux.sh`.

- selftests/bpf: Script to verify uapi headers usage with vmlinux.h

  Adds a script `test_uapi_headers.py` that tests header guards with
  vmlinux.h by compiling a simple C snippet. The snippet looks as
  follows:
  
    #include <some_uapi_header.h>
    #include "vmlinux.h"
  
    __attribute__((section("tc"), used))
    int syncookie_tc(struct __sk_buff *skb) { return 0; }
  
  The list of headers to test comes from
  `tools/testing/selftests/bpf/good_uapi_headers.txt`.

- selftests/bpf: Known good uapi headers for test_uapi_headers.py

  The list of uapi headers that could be included alongside vmlinux.h.
  The headers are peeked from the following locations:
  - <headers-export-dir>/linux/*.h
  - <headers-export-dir>/linux/**/*.h
  This choice of locations is somewhat arbitrary.

- selftests/bpf: script for infer_header_guards.pl testing

  The test case for `scripts/infer_header_guards.pl`, verifies that
  header guards can be inferred for all uapi headers.

- There is also a patch for dwarves that adds `--header_guards_db`
  option (see [2]).

The `test_uapi_headers.py` is important as it demonstrates the
the necessary compiler flags:

clang ...                                  \
      -D__x86_64__                         \
      -Xclang -fwchar-type=short           \
      -Xclang -fno-signed-wchar            \
      -I{exported_kernel_headers}/include/ \
      ...

- `-fwchar-type=short` and `-fno-signed-wchar` had to be added because
  BPF target uses `int` for `wchar_t` by default and this differs from
  `vmlinux.h` definition of the type (at least for x86_64).
- `__x86_64__` had to be added for uapi headers that include
  `stddef.h` (the one that is supplied my CLANG itself), in order to
  define correct sizes for `size_t` and `ptrdiff_t`.
- The `{exported_kernel_headers}` stands for exported kernel headers
  directory (the headers obtained by `make headers_install` or via
  distribution package).

When it works
---

The mechanics described above works for a significant number of UAPI
headers. For example, for the test case above I chose the headers from
the following locations:
- linux/*.h
- linux/**/*.h
There are 759 such headers and for 677 of them the test described
above passes.

I excluded the headers from the following sub-directories as
potentially not interesting:

  asm          rdma   video xen
  asm-generic  misc   scsi
  drm          mtd    sound

Thus saving some time for both discussion and CI but the choice is
somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all
headers) test passes for 834 out of 972 headers.

When it breaks
---

There several scenarios when this mechanics breaks.
Specifically I found the following cases:
- When uapi header includes some system header that conflicts with
  vmlinux.h.
- When uapi header itself conflicts with vmlinux.h.

Below are examples for both cases.

Conflict with system headers
----

The following uapi headers:
- linux/atmbr2684.h
- linux/bpfilter.h
- linux/gsmmux.h
- linux/icmp.h
- linux/if.h
- linux/if_arp.h
- linux/if_bonding.h
- linux/if_pppox.h
- linux/if_tunnel.h
- linux/ip6_tunnel.h
- linux/llc.h
- linux/mctp.h
- linux/mptcp.h
- linux/netdevice.h
- linux/netfilter/xt_RATEEST.h
- linux/netfilter/xt_hashlimit.h
- linux/netfilter/xt_physdev.h
- linux/netfilter/xt_rateest.h
- linux/netfilter_arp/arp_tables.h
- linux/netfilter_arp/arpt_mangle.h
- linux/netfilter_bridge.h
- linux/netfilter_bridge/ebtables.h
- linux/netfilter_ipv4/ip_tables.h
- linux/netfilter_ipv6/ip6_tables.h
- linux/route.h
- linux/wireless.h

Include the following system header:
- /usr/include/sys/socket.h (all via linux/if.h)

The sys/socket.h conflicts with vmlinux.h in:
- types: struct iovec, struct sockaddr, struct msghdr, ...
- constants: SOCK_STREAM, SOCK_DGRAM, ...

However, only two types are actually used:
- struct sockaddr
- struct sockaddr_storage (used only in linux/mptcp.h)

In 'vmlinux.h' this type originates from 'kernel/include/socket.h'
(non UAPI header), thus does not have a header guard.

The only workaround that I see is to:
- define a stub sys/socket.h as follows:

    #ifndef __BPF_SOCKADDR__
    #define __BPF_SOCKADDR__
    
    /* For __kernel_sa_family_t */
    #include <linux/socket.h>
    
    struct sockaddr {
        __kernel_sa_family_t sa_family;
        char sa_data[14];
    };
    
    #endif

- hardcode generation of __BPF_SOCKADDR__ bracket for
  'struct sockaddr' in vmlinux.h.

Another possibility is to move the definition of 'struct sockaddr'
from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h',
but I expect that this won't fly with the mainline as it might break
the programs that include both 'linux/socket.h' and 'sys/socket.h'.

Conflict with vmlinux.h
----

Uapi header:
- linux/signal.h

Conflict with vmlinux.h in definition of 'struct sigaction'.
Defined in:
- vmlinux.h: kernel/include/linux/signal_types.h
- uapi:      kernel/arch/x86/include/asm/signal.h

Uapi headers:
- linux/tipc_sockets_diag.h
- linux/sock_diag.h

Conflict with vmlinux.h in definition of 'SOCK_DESTROY'.
Defined in:
- vmlinux.h: kernel/include/net/sock.h
- uapi:      kernel/include/uapi/linux/sock_diag.h
Constants seem to be unrelated.

And so on... I have details for many other headers but omit those for
brevity.

In conclusion
---

Except from the general feasibility I have a few questions:
- What UAPI headers are the candidates for such use? If there are some
  interesting headers currently not working with this patch-set some
  hacks have to be added (e.g. like with `linux/if.h`).
- Is it ok to encode header guards as special `BTF_DECL_TAG` or should
  I change the BTF format a bit to save some bytes.

Thanks,
Eduard


[1] https://reviews.llvm.org/D111307
    [clang] __attribute__ bpf_dominating_decl
[2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@xxxxxxxxx/T/
    [RFC dwarves] pahole: Save header guard names when
                  --header_guards_db is passed

Eduard Zingerman (12):
  libbpf: Deduplicate unambigous standalone forward declarations
  selftests/bpf: Tests for standalone forward BTF declarations
    deduplication
  libbpf: Support for BTF_DECL_TAG dump in C format
  selftests/bpf: Tests for BTF_DECL_TAG dump in C format
  libbpf: Header guards for selected data structures in vmlinux.h
  selftests/bpf: Tests for header guards printing in BTF dump
  bpftool: Enable header guards generation
  kbuild: Script to infer header guard values for uapi headers
  kbuild: Header guards for types from include/uapi/*.h in kernel BTF
  selftests/bpf: Script to verify uapi headers usage with vmlinux.h
  selftests/bpf: Known good uapi headers for test_uapi_headers.py
  selftests/bpf: script for infer_header_guards.pl testing

 scripts/infer_header_guards.pl                | 191 +++++
 scripts/link-vmlinux.sh                       |  13 +-
 tools/bpf/bpftool/btf.c                       |   4 +-
 tools/lib/bpf/btf.c                           | 178 ++++-
 tools/lib/bpf/btf.h                           |   7 +-
 tools/lib/bpf/btf_dump.c                      | 232 +++++-
 .../selftests/bpf/good_uapi_headers.txt       | 677 ++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/btf.c  | 152 ++++
 .../selftests/bpf/prog_tests/btf_dump.c       |  11 +-
 .../bpf/progs/btf_dump_test_case_decl_tag.c   |  39 +
 .../progs/btf_dump_test_case_header_guards.c  |  94 +++
 .../bpf/test_uapi_header_guards_infer.sh      |  33 +
 .../selftests/bpf/test_uapi_headers.py        | 197 +++++
 13 files changed, 1816 insertions(+), 12 deletions(-)
 create mode 100755 scripts/infer_header_guards.pl
 create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt
 create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c
 create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c
 create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh
 create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py

-- 
2.34.1




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux