On 25/10/2022 23:27, Eduard Zingerman wrote: > Hi BPF community, > > AFAIK there is a long standing feature request to use kernel headers > alongside `vmlinux.h` generated by `bpftool`. For example significant > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > clang, unfortunately this effort was stuck due to concerns regarding C > language semantics. > Thanks for the details! It's a tricky problem to solve. Before diving into this, can I ask if there's another way round this; is there no way we could teach vmlinux.h generation which types to skip via some kind of bootstrapping process? For a bpf object foo.bpf.c that wants to include linux/tcp.h and vmlinux.h, something like this seems possible; 1. run the preprocessor (gcc -E) over the BPF program to generate a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned in it and associated includes, but not in vmlinux.h It would need to -D__VMLINUX_H__ to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to skip the programs themselves, which would need a guard around them I think: #include <stddef.h> #include <stdbool.h> #include <vmlinux.h> #include <linux/tcp.h> #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__ //rest of program here #endif So as a result of this, we now have a single header file that contains all the types that non-vmlinux.h include files define. 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an exclude header: bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h bpftool would have to parse the exclude header for actual type definitions, spotting struct, enum and typedef definitions. This is likely made easier by running the preprocessor at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations for types described both in vmlinux BTF and in the exclude file. So the build process would be - start with empty vmlinux.h - bootstrap a header consisting of the set of types to exclude via c preprocessor - generate new vmlinux.h based on above - build bpf program Build processes for BPF objects already has bootstrapping elements like this for generating vmlinux.h and skeletons, so while it's not perfect it might be a simpler approach. There may be problems with this I'm not seeing though? Thanks! Alan > After some discussion with Alexei and Yonghong I'd like to request > your comments regarding a somewhat brittle and partial solution to > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > the generated `vmlinux.h`. > > The basic idea > --- > > The goal of the patch set is to allow usage of header files from > `include/uapi` alongside `vmlinux.h` as follows: > > #include <uapi/linux/tcp.h> > #include "vmlinux.h" > > This goal is achieved by adding `#ifndef ... #endif` guards in > `vmlinux.h` around definitions that originate from the `include/uapi` > headers. The guards emitted match the guards used in the original > headers. E.g. as follows: > > include/uapi/linux/tcp.h: > > #ifndef _UAPI_LINUX_TCP_H > #define _UAPI_LINUX_TCP_H > ... > union tcp_word_hdr { > struct tcphdr hdr; > __be32 words[5]; > }; > ... > #endif /* _UAPI_LINUX_TCP_H */ > > vmlinux.h: > > ... > #ifndef _UAPI_LINUX_TCP_H > > union tcp_word_hdr { > struct tcphdr hdr; > __be32 words[5]; > }; > > #endif /* _UAPI_LINUX_TCP_H */ > ... > > To get to this state the following steps are necessary: > - "header guard" name should be identified for each header file; > - the correspondence between data type and it's header guard has to be > encoded in BTF; > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif` > brackets. > > It is not possible to identify header guard names for all uapi headers > basing only on the file name. However a simple script could devised to > identify the guards basing on the file name and it's content. Thus it > is possible to obtain the list of header names with corresponding > header guards. > > The correspondence between type and it's declaration file (header) is > available in DWARF as `DW_AT_decl_file` attribute. The > `DW_AT_decl_file` can be matched with the list of header guards > described above to obtain the header guard name for a specific type. > > The `pahole` generates BTF using DWARF. It is possible to modify > `pahole` to accept the header guards list as an additional parameter > and to encode the header guard names in BTF. > > Implementation details > --- > > Present patch-set implements these ideas as follows: > - A parameter `--header_guards_db` is added to `pahole`. If present it > points to a file with a list of `<header> <guard>` records. > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header > guard for each type emitted to BTF. If header guard is present it is > encoded alongside the type. > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a > special prefix. The prefix "header_guard:" is added to a value of > such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format > changes). > - A special script `infer_header_guards.pl` is added as a part of > kbuild, it can infer header guard names for each UAPI header basing > on the header content. > - This script is invoked from `link-vmlinux.sh` prior to BTF > generation during kernel build. The output of the script is saved to > a file, the file is passed to `pahole` as `--header_guards_db` > parameter. > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each > type and to emit `#ifndef FOO_H ... #endif` brackets when > "header_guard:" tag is present for a type. > > Details for each patch in a set: > - libbpf: Deduplicate unambigous standalone forward declarations > - selftests/bpf: Tests for standalone forward BTF declarations deduplication > > There is a small number (63 for defconfig) of forward declarations > that are not de-duplicated with the main type declaration under > certain conditions. This hinders the header guard brackets > generation. This patch addresses this de-duplication issue. > > - libbpf: Support for BTF_DECL_TAG dump in C format > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > Currently libbpf does not process BTF_DECL_TAG when btf is dumped in > C format. This patch adds a hash table matching btf type ids with a > list of decl tags to the struct btf_dump. > The `btf_dump_emit_decl_tags` is not necessary for the overall > patch-set to function but simplifies testing a bit. > > - libbpf: Header guards for selected data structures in vmlinux.h > - selftests/bpf: Tests for header guards printing in BTF dump > > Adds option `emit_header_guards` to `struct btf_dump_opts`. > When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif` > brackets around types for which header guard information is present > in BTF. > > - bpftool: Enable header guards generation > > Unconditionally enables `emit_header_guards` for BTF dump in C format. > > - kbuild: Script to infer header guard values for uapi headers > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > Adds `scripts/infer_header_guards.pl` and integrates it with > `link-vmlinux.sh`. > > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > Adds a script `test_uapi_headers.py` that tests header guards with > vmlinux.h by compiling a simple C snippet. The snippet looks as > follows: > > #include <some_uapi_header.h> > #include "vmlinux.h" > > __attribute__((section("tc"), used)) > int syncookie_tc(struct __sk_buff *skb) { return 0; } > > The list of headers to test comes from > `tools/testing/selftests/bpf/good_uapi_headers.txt`. > > - selftests/bpf: Known good uapi headers for test_uapi_headers.py > > The list of uapi headers that could be included alongside vmlinux.h. > The headers are peeked from the following locations: > - <headers-export-dir>/linux/*.h > - <headers-export-dir>/linux/**/*.h > This choice of locations is somewhat arbitrary. > > - selftests/bpf: script for infer_header_guards.pl testing > > The test case for `scripts/infer_header_guards.pl`, verifies that > header guards can be inferred for all uapi headers. > > - There is also a patch for dwarves that adds `--header_guards_db` > option (see [2]). > > The `test_uapi_headers.py` is important as it demonstrates the > the necessary compiler flags: > > clang ... \ > -D__x86_64__ \ > -Xclang -fwchar-type=short \ > -Xclang -fno-signed-wchar \ > -I{exported_kernel_headers}/include/ \ > ... > > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because > BPF target uses `int` for `wchar_t` by default and this differs from > `vmlinux.h` definition of the type (at least for x86_64). > - `__x86_64__` had to be added for uapi headers that include > `stddef.h` (the one that is supplied my CLANG itself), in order to > define correct sizes for `size_t` and `ptrdiff_t`. > - The `{exported_kernel_headers}` stands for exported kernel headers > directory (the headers obtained by `make headers_install` or via > distribution package). > > When it works > --- > > The mechanics described above works for a significant number of UAPI > headers. For example, for the test case above I chose the headers from > the following locations: > - linux/*.h > - linux/**/*.h > There are 759 such headers and for 677 of them the test described > above passes. > > I excluded the headers from the following sub-directories as > potentially not interesting: > > asm rdma video xen > asm-generic misc scsi > drm mtd sound > > Thus saving some time for both discussion and CI but the choice is > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all > headers) test passes for 834 out of 972 headers. > > When it breaks > --- > > There several scenarios when this mechanics breaks. > Specifically I found the following cases: > - When uapi header includes some system header that conflicts with > vmlinux.h. > - When uapi header itself conflicts with vmlinux.h. > > Below are examples for both cases. > > Conflict with system headers > ---- > > The following uapi headers: > - linux/atmbr2684.h > - linux/bpfilter.h > - linux/gsmmux.h > - linux/icmp.h > - linux/if.h > - linux/if_arp.h > - linux/if_bonding.h > - linux/if_pppox.h > - linux/if_tunnel.h > - linux/ip6_tunnel.h > - linux/llc.h > - linux/mctp.h > - linux/mptcp.h > - linux/netdevice.h > - linux/netfilter/xt_RATEEST.h > - linux/netfilter/xt_hashlimit.h > - linux/netfilter/xt_physdev.h > - linux/netfilter/xt_rateest.h > - linux/netfilter_arp/arp_tables.h > - linux/netfilter_arp/arpt_mangle.h > - linux/netfilter_bridge.h > - linux/netfilter_bridge/ebtables.h > - linux/netfilter_ipv4/ip_tables.h > - linux/netfilter_ipv6/ip6_tables.h > - linux/route.h > - linux/wireless.h > > Include the following system header: > - /usr/include/sys/socket.h (all via linux/if.h) > > The sys/socket.h conflicts with vmlinux.h in: > - types: struct iovec, struct sockaddr, struct msghdr, ... > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > However, only two types are actually used: > - struct sockaddr > - struct sockaddr_storage (used only in linux/mptcp.h) > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > (non UAPI header), thus does not have a header guard. > > The only workaround that I see is to: > - define a stub sys/socket.h as follows: > > #ifndef __BPF_SOCKADDR__ > #define __BPF_SOCKADDR__ > > /* For __kernel_sa_family_t */ > #include <linux/socket.h> > > struct sockaddr { > __kernel_sa_family_t sa_family; > char sa_data[14]; > }; > > #endif > > - hardcode generation of __BPF_SOCKADDR__ bracket for > 'struct sockaddr' in vmlinux.h. > > Another possibility is to move the definition of 'struct sockaddr' > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > but I expect that this won't fly with the mainline as it might break > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > Conflict with vmlinux.h > ---- > > Uapi header: > - linux/signal.h > > Conflict with vmlinux.h in definition of 'struct sigaction'. > Defined in: > - vmlinux.h: kernel/include/linux/signal_types.h > - uapi: kernel/arch/x86/include/asm/signal.h > > Uapi headers: > - linux/tipc_sockets_diag.h > - linux/sock_diag.h > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. > Defined in: > - vmlinux.h: kernel/include/net/sock.h > - uapi: kernel/include/uapi/linux/sock_diag.h > Constants seem to be unrelated. > > And so on... I have details for many other headers but omit those for > brevity. > > In conclusion > --- > > Except from the general feasibility I have a few questions: > - What UAPI headers are the candidates for such use? If there are some > interesting headers currently not working with this patch-set some > hacks have to be added (e.g. like with `linux/if.h`). > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should > I change the BTF format a bit to save some bytes. > > Thanks, > Eduard > > > [1] https://reviews.llvm.org/D111307 > [clang] __attribute__ bpf_dominating_decl > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@xxxxxxxxx/T/ > [RFC dwarves] pahole: Save header guard names when > --header_guards_db is passed > > Eduard Zingerman (12): > libbpf: Deduplicate unambigous standalone forward declarations > selftests/bpf: Tests for standalone forward BTF declarations > deduplication > libbpf: Support for BTF_DECL_TAG dump in C format > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > libbpf: Header guards for selected data structures in vmlinux.h > selftests/bpf: Tests for header guards printing in BTF dump > bpftool: Enable header guards generation > kbuild: Script to infer header guard values for uapi headers > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > selftests/bpf: Known good uapi headers for test_uapi_headers.py > selftests/bpf: script for infer_header_guards.pl testing > > scripts/infer_header_guards.pl | 191 +++++ > scripts/link-vmlinux.sh | 13 +- > tools/bpf/bpftool/btf.c | 4 +- > tools/lib/bpf/btf.c | 178 ++++- > tools/lib/bpf/btf.h | 7 +- > tools/lib/bpf/btf_dump.c | 232 +++++- > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > .../bpf/test_uapi_header_guards_infer.sh | 33 + > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > 13 files changed, 1816 insertions(+), 12 deletions(-) > create mode 100755 scripts/infer_header_guards.pl > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >