On Wed, 2022-10-26 at 12:10 +0100, Alan Maguire wrote: > On 25/10/2022 23:27, Eduard Zingerman wrote: > > Hi BPF community, > > > > AFAIK there is a long standing feature request to use kernel headers > > alongside `vmlinux.h` generated by `bpftool`. For example significant > > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > > clang, unfortunately this effort was stuck due to concerns regarding C > > language semantics. > > > > Thanks for the details! It's a tricky problem to solve. > > Before diving into this, can I ask if there's another way round this; > is there no way we could teach vmlinux.h generation which types to > skip via some kind of bootstrapping process? > > For a bpf object foo.bpf.c that wants to include linux/tcp.h and > vmlinux.h, something like this seems possible; > > 1. run the preprocessor (gcc -E) over the BPF program to generate > a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned > in it and associated includes, but not in vmlinux.h It would need to -D__VMLINUX_H__ > to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to > skip the programs themselves, which would need a guard around them I think: > > #include <stddef.h> > #include <stdbool.h> > #include <vmlinux.h> > #include <linux/tcp.h> > > #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__ > //rest of program here > #endif > > So as a result of this, we now have a single header file that contains all the types > that non-vmlinux.h include files define. > > 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an > exclude header: > > bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h > > bpftool would have to parse the exclude header for actual type definitions, spotting struct, > enum and typedef definitions. This is likely made easier by running the preprocessor > at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations > for types described both in vmlinux BTF and in the exclude file. > > So the build process would be > > - start with empty vmlinux.h > - bootstrap a header consisting of the set of types to exclude via c preprocessor > - generate new vmlinux.h based on above > - build bpf program > > Build processes for BPF objects already has bootstrapping elements like > this for generating vmlinux.h and skeletons, so while it's not perfect it might > be a simpler approach. There may be problems with this I'm not seeing though? I like the tool based approach more but I heard there were some reservations about separate tool complicating the build process. - the tool does not require changes to the kbuild; - the tool is not limited to uapi headers; - the tool could imitate the dominating declarations attribute and address the issue with definitions miss-match between vmlinux.h and system headers (see my reply to Alexei in the parallel sub-thread). To support this point it would have to work in a somewhat different order: - read/pre-process the input file; - remove from it the definitions that are also present in kernel BTF; - prepend vmlinux.h to the beginning of the file. On the other hand I think that implementation would need a C parser / type analysis step. Thus it would be harder to implement it as a part of bpftool. But the prototype using something like [1] should be simple. Thanks, Eduard [1] https://github.com/inducer/pycparserext > > Thanks! > > Alan > > > After some discussion with Alexei and Yonghong I'd like to request > > your comments regarding a somewhat brittle and partial solution to > > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > > the generated `vmlinux.h`. > > > > The basic idea > > --- > > > > The goal of the patch set is to allow usage of header files from > > `include/uapi` alongside `vmlinux.h` as follows: > > > > #include <uapi/linux/tcp.h> > > #include "vmlinux.h" > > > > This goal is achieved by adding `#ifndef ... #endif` guards in > > `vmlinux.h` around definitions that originate from the `include/uapi` > > headers. The guards emitted match the guards used in the original > > headers. E.g. as follows: > > > > include/uapi/linux/tcp.h: > > > > #ifndef _UAPI_LINUX_TCP_H > > #define _UAPI_LINUX_TCP_H > > ... > > union tcp_word_hdr { > > struct tcphdr hdr; > > __be32 words[5]; > > }; > > ... > > #endif /* _UAPI_LINUX_TCP_H */ > > > > vmlinux.h: > > > > ... > > #ifndef _UAPI_LINUX_TCP_H > > > > union tcp_word_hdr { > > struct tcphdr hdr; > > __be32 words[5]; > > }; > > > > #endif /* _UAPI_LINUX_TCP_H */ > > ... > > > > To get to this state the following steps are necessary: > > - "header guard" name should be identified for each header file; > > - the correspondence between data type and it's header guard has to be > > encoded in BTF; > > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif` > > brackets. > > > > It is not possible to identify header guard names for all uapi headers > > basing only on the file name. However a simple script could devised to > > identify the guards basing on the file name and it's content. Thus it > > is possible to obtain the list of header names with corresponding > > header guards. > > > > The correspondence between type and it's declaration file (header) is > > available in DWARF as `DW_AT_decl_file` attribute. The > > `DW_AT_decl_file` can be matched with the list of header guards > > described above to obtain the header guard name for a specific type. > > > > The `pahole` generates BTF using DWARF. It is possible to modify > > `pahole` to accept the header guards list as an additional parameter > > and to encode the header guard names in BTF. > > > > Implementation details > > --- > > > > Present patch-set implements these ideas as follows: > > - A parameter `--header_guards_db` is added to `pahole`. If present it > > points to a file with a list of `<header> <guard>` records. > > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header > > guard for each type emitted to BTF. If header guard is present it is > > encoded alongside the type. > > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a > > special prefix. The prefix "header_guard:" is added to a value of > > such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format > > changes). > > - A special script `infer_header_guards.pl` is added as a part of > > kbuild, it can infer header guard names for each UAPI header basing > > on the header content. > > - This script is invoked from `link-vmlinux.sh` prior to BTF > > generation during kernel build. The output of the script is saved to > > a file, the file is passed to `pahole` as `--header_guards_db` > > parameter. > > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each > > type and to emit `#ifndef FOO_H ... #endif` brackets when > > "header_guard:" tag is present for a type. > > > > Details for each patch in a set: > > - libbpf: Deduplicate unambigous standalone forward declarations > > - selftests/bpf: Tests for standalone forward BTF declarations deduplication > > > > There is a small number (63 for defconfig) of forward declarations > > that are not de-duplicated with the main type declaration under > > certain conditions. This hinders the header guard brackets > > generation. This patch addresses this de-duplication issue. > > > > - libbpf: Support for BTF_DECL_TAG dump in C format > > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > > > Currently libbpf does not process BTF_DECL_TAG when btf is dumped in > > C format. This patch adds a hash table matching btf type ids with a > > list of decl tags to the struct btf_dump. > > The `btf_dump_emit_decl_tags` is not necessary for the overall > > patch-set to function but simplifies testing a bit. > > > > - libbpf: Header guards for selected data structures in vmlinux.h > > - selftests/bpf: Tests for header guards printing in BTF dump > > > > Adds option `emit_header_guards` to `struct btf_dump_opts`. > > When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif` > > brackets around types for which header guard information is present > > in BTF. > > > > - bpftool: Enable header guards generation > > > > Unconditionally enables `emit_header_guards` for BTF dump in C format. > > > > - kbuild: Script to infer header guard values for uapi headers > > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > > > Adds `scripts/infer_header_guards.pl` and integrates it with > > `link-vmlinux.sh`. > > > > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > > > Adds a script `test_uapi_headers.py` that tests header guards with > > vmlinux.h by compiling a simple C snippet. The snippet looks as > > follows: > > > > #include <some_uapi_header.h> > > #include "vmlinux.h" > > > > __attribute__((section("tc"), used)) > > int syncookie_tc(struct __sk_buff *skb) { return 0; } > > > > The list of headers to test comes from > > `tools/testing/selftests/bpf/good_uapi_headers.txt`. > > > > - selftests/bpf: Known good uapi headers for test_uapi_headers.py > > > > The list of uapi headers that could be included alongside vmlinux.h. > > The headers are peeked from the following locations: > > - <headers-export-dir>/linux/*.h > > - <headers-export-dir>/linux/**/*.h > > This choice of locations is somewhat arbitrary. > > > > - selftests/bpf: script for infer_header_guards.pl testing > > > > The test case for `scripts/infer_header_guards.pl`, verifies that > > header guards can be inferred for all uapi headers. > > > > - There is also a patch for dwarves that adds `--header_guards_db` > > option (see [2]). > > > > The `test_uapi_headers.py` is important as it demonstrates the > > the necessary compiler flags: > > > > clang ... \ > > -D__x86_64__ \ > > -Xclang -fwchar-type=short \ > > -Xclang -fno-signed-wchar \ > > -I{exported_kernel_headers}/include/ \ > > ... > > > > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because > > BPF target uses `int` for `wchar_t` by default and this differs from > > `vmlinux.h` definition of the type (at least for x86_64). > > - `__x86_64__` had to be added for uapi headers that include > > `stddef.h` (the one that is supplied my CLANG itself), in order to > > define correct sizes for `size_t` and `ptrdiff_t`. > > - The `{exported_kernel_headers}` stands for exported kernel headers > > directory (the headers obtained by `make headers_install` or via > > distribution package). > > > > When it works > > --- > > > > The mechanics described above works for a significant number of UAPI > > headers. For example, for the test case above I chose the headers from > > the following locations: > > - linux/*.h > > - linux/**/*.h > > There are 759 such headers and for 677 of them the test described > > above passes. > > > > I excluded the headers from the following sub-directories as > > potentially not interesting: > > > > asm rdma video xen > > asm-generic misc scsi > > drm mtd sound > > > > Thus saving some time for both discussion and CI but the choice is > > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all > > headers) test passes for 834 out of 972 headers. > > > > When it breaks > > --- > > > > There several scenarios when this mechanics breaks. > > Specifically I found the following cases: > > - When uapi header includes some system header that conflicts with > > vmlinux.h. > > - When uapi header itself conflicts with vmlinux.h. > > > > Below are examples for both cases. > > > > Conflict with system headers > > ---- > > > > The following uapi headers: > > - linux/atmbr2684.h > > - linux/bpfilter.h > > - linux/gsmmux.h > > - linux/icmp.h > > - linux/if.h > > - linux/if_arp.h > > - linux/if_bonding.h > > - linux/if_pppox.h > > - linux/if_tunnel.h > > - linux/ip6_tunnel.h > > - linux/llc.h > > - linux/mctp.h > > - linux/mptcp.h > > - linux/netdevice.h > > - linux/netfilter/xt_RATEEST.h > > - linux/netfilter/xt_hashlimit.h > > - linux/netfilter/xt_physdev.h > > - linux/netfilter/xt_rateest.h > > - linux/netfilter_arp/arp_tables.h > > - linux/netfilter_arp/arpt_mangle.h > > - linux/netfilter_bridge.h > > - linux/netfilter_bridge/ebtables.h > > - linux/netfilter_ipv4/ip_tables.h > > - linux/netfilter_ipv6/ip6_tables.h > > - linux/route.h > > - linux/wireless.h > > > > Include the following system header: > > - /usr/include/sys/socket.h (all via linux/if.h) > > > > The sys/socket.h conflicts with vmlinux.h in: > > - types: struct iovec, struct sockaddr, struct msghdr, ... > > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > > > However, only two types are actually used: > > - struct sockaddr > > - struct sockaddr_storage (used only in linux/mptcp.h) > > > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > > (non UAPI header), thus does not have a header guard. > > > > The only workaround that I see is to: > > - define a stub sys/socket.h as follows: > > > > #ifndef __BPF_SOCKADDR__ > > #define __BPF_SOCKADDR__ > > > > /* For __kernel_sa_family_t */ > > #include <linux/socket.h> > > > > struct sockaddr { > > __kernel_sa_family_t sa_family; > > char sa_data[14]; > > }; > > > > #endif > > > > - hardcode generation of __BPF_SOCKADDR__ bracket for > > 'struct sockaddr' in vmlinux.h. > > > > Another possibility is to move the definition of 'struct sockaddr' > > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > > but I expect that this won't fly with the mainline as it might break > > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > > > Conflict with vmlinux.h > > ---- > > > > Uapi header: > > - linux/signal.h > > > > Conflict with vmlinux.h in definition of 'struct sigaction'. > > Defined in: > > - vmlinux.h: kernel/include/linux/signal_types.h > > - uapi: kernel/arch/x86/include/asm/signal.h > > > > Uapi headers: > > - linux/tipc_sockets_diag.h > > - linux/sock_diag.h > > > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. > > Defined in: > > - vmlinux.h: kernel/include/net/sock.h > > - uapi: kernel/include/uapi/linux/sock_diag.h > > Constants seem to be unrelated. > > > > And so on... I have details for many other headers but omit those for > > brevity. > > > > In conclusion > > --- > > > > Except from the general feasibility I have a few questions: > > - What UAPI headers are the candidates for such use? If there are some > > interesting headers currently not working with this patch-set some > > hacks have to be added (e.g. like with `linux/if.h`). > > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should > > I change the BTF format a bit to save some bytes. > > > > Thanks, > > Eduard > > > > > > [1] https://reviews.llvm.org/D111307 > > [clang] __attribute__ bpf_dominating_decl > > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@xxxxxxxxx/T/ > > [RFC dwarves] pahole: Save header guard names when > > --header_guards_db is passed > > > > Eduard Zingerman (12): > > libbpf: Deduplicate unambigous standalone forward declarations > > selftests/bpf: Tests for standalone forward BTF declarations > > deduplication > > libbpf: Support for BTF_DECL_TAG dump in C format > > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > libbpf: Header guards for selected data structures in vmlinux.h > > selftests/bpf: Tests for header guards printing in BTF dump > > bpftool: Enable header guards generation > > kbuild: Script to infer header guard values for uapi headers > > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > selftests/bpf: Known good uapi headers for test_uapi_headers.py > > selftests/bpf: script for infer_header_guards.pl testing > > > > scripts/infer_header_guards.pl | 191 +++++ > > scripts/link-vmlinux.sh | 13 +- > > tools/bpf/bpftool/btf.c | 4 +- > > tools/lib/bpf/btf.c | 178 ++++- > > tools/lib/bpf/btf.h | 7 +- > > tools/lib/bpf/btf_dump.c | 232 +++++- > > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > > .../bpf/test_uapi_header_guards_infer.sh | 33 + > > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > > 13 files changed, 1816 insertions(+), 12 deletions(-) > > create mode 100755 scripts/infer_header_guards.pl > > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > >