Re: [RFC bpf-next 00/12] Use uapi kernel headers with vmlinux.h

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2022-10-26 at 12:10 +0100, Alan Maguire wrote:
> On 25/10/2022 23:27, Eduard Zingerman wrote:
> > Hi BPF community,
> > 
> > AFAIK there is a long standing feature request to use kernel headers
> > alongside `vmlinux.h` generated by `bpftool`. For example significant
> > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to
> > clang, unfortunately this effort was stuck due to concerns regarding C
> > language semantics.
> > 
> 
> Thanks for the details! It's a tricky problem to solve.
> 
> Before diving into this, can I ask if there's another way round this;
> is there no way we could teach vmlinux.h generation which types to
> skip via some kind of bootstrapping process?
> 
> For a bpf object foo.bpf.c that wants to include linux/tcp.h and 
> vmlinux.h, something like this seems possible;
> 
> 1. run the preprocessor (gcc -E) over the BPF program to generate 
> a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned
> in it and associated includes, but not in vmlinux.h  It would need to -D__VMLINUX_H__ 
> to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to 
> skip the programs themselves, which would need a guard around them I think:
> 
> #include <stddef.h>
> #include <stdbool.h>
> #include <vmlinux.h>
> #include <linux/tcp.h>
> 
> #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__
> //rest of program here
> #endif
> 
> So as a result of this, we now have a single header file that contains all the types
> that non-vmlinux.h include files define.
> 
> 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an
> exclude header:
> 
> bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h
> 
> bpftool would have to parse the exclude header for actual type definitions, spotting struct,
> enum and typedef definitions. This is likely made easier by running the preprocessor
> at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations
> for types described both in vmlinux BTF and in the exclude file.
> 
> So the build process would be
> 
> - start with empty vmlinux.h
> - bootstrap a header consisting of the set of types to exclude via c preprocessor
> - generate new vmlinux.h based on above
> - build bpf program
> 
> Build processes for BPF objects already has bootstrapping elements like
> this for generating vmlinux.h and skeletons, so while it's not perfect it might
> be a simpler approach. There may be problems with this I'm not seeing though?

I like the tool based approach more but I heard there were some
reservations about separate tool complicating the build process.
- the tool does not require changes to the kbuild;
- the tool is not limited to uapi headers;
- the tool could imitate the dominating declarations attribute and
  address the issue with definitions miss-match between vmlinux.h and
  system headers (see my reply to Alexei in the parallel sub-thread).
  To support this point it would have to work in a somewhat different
  order:
  - read/pre-process the input file;
  - remove from it the definitions that are also present in kernel BTF;
  - prepend vmlinux.h to the beginning of the file.

On the other hand I think that implementation would need a C parser /
type analysis step. Thus it would be harder to implement it as a part
of bpftool. But the prototype using something like [1] should be simple.

Thanks,
Eduard


[1] https://github.com/inducer/pycparserext

> 
> Thanks!
> 
> Alan
> 
> > After some discussion with Alexei and Yonghong I'd like to request
> > your comments regarding a somewhat brittle and partial solution to
> > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in
> > the generated `vmlinux.h`.
> > 
> > The basic idea
> > ---
> > 
> > The goal of the patch set is to allow usage of header files from
> > `include/uapi` alongside `vmlinux.h` as follows:
> > 
> >   #include <uapi/linux/tcp.h>
> >   #include "vmlinux.h"
> > 
> > This goal is achieved by adding `#ifndef ... #endif` guards in
> > `vmlinux.h` around definitions that originate from the `include/uapi`
> > headers. The guards emitted match the guards used in the original
> > headers. E.g. as follows:
> > 
> > include/uapi/linux/tcp.h:
> > 
> >   #ifndef _UAPI_LINUX_TCP_H
> >   #define _UAPI_LINUX_TCP_H
> >   ...
> >   union tcp_word_hdr {
> > 	struct tcphdr hdr;
> > 	__be32        words[5];
> >   };
> >   ...
> >   #endif /* _UAPI_LINUX_TCP_H */
> > 
> > vmlinux.h:
> > 
> >   ...
> >   #ifndef _UAPI_LINUX_TCP_H
> > 
> >   union tcp_word_hdr {
> > 	struct tcphdr hdr;
> > 	__be32 words[5];
> >   };
> > 
> >   #endif /* _UAPI_LINUX_TCP_H */
> >   ...
> > 
> > To get to this state the following steps are necessary:
> > - "header guard" name should be identified for each header file;
> > - the correspondence between data type and it's header guard has to be
> >   encoded in BTF;
> > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif`
> >   brackets.
> > 
> > It is not possible to identify header guard names for all uapi headers
> > basing only on the file name. However a simple script could devised to
> > identify the guards basing on the file name and it's content. Thus it
> > is possible to obtain the list of header names with corresponding
> > header guards.
> > 
> > The correspondence between type and it's declaration file (header) is
> > available in DWARF as `DW_AT_decl_file` attribute. The
> > `DW_AT_decl_file` can be matched with the list of header guards
> > described above to obtain the header guard name for a specific type.
> > 
> > The `pahole` generates BTF using DWARF. It is possible to modify
> > `pahole` to accept the header guards list as an additional parameter
> > and to encode the header guard names in BTF.
> > 
> > Implementation details
> > ---
> > 
> > Present patch-set implements these ideas as follows:
> > - A parameter `--header_guards_db` is added to `pahole`. If present it
> >   points to a file with a list of `<header> <guard>` records.
> > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header
> >   guard for each type emitted to BTF. If header guard is present it is
> >   encoded alongside the type.
> > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a
> >   special prefix. The prefix "header_guard:" is added to a value of
> >   such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format
> >   changes).
> > - A special script `infer_header_guards.pl` is added as a part of
> >   kbuild, it can infer header guard names for each UAPI header basing
> >   on the header content.
> > - This script is invoked from `link-vmlinux.sh` prior to BTF
> >   generation during kernel build. The output of the script is saved to
> >   a file, the file is passed to `pahole` as `--header_guards_db`
> >   parameter.
> > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each
> >   type and to emit `#ifndef FOO_H ... #endif` brackets when
> >   "header_guard:" tag is present for a type.
> > 
> > Details for each patch in a set:
> > - libbpf: Deduplicate unambigous standalone forward declarations
> > - selftests/bpf: Tests for standalone forward BTF declarations deduplication
> > 
> >   There is a small number (63 for defconfig) of forward declarations
> >   that are not de-duplicated with the main type declaration under
> >   certain conditions. This hinders the header guard brackets
> >   generation. This patch addresses this de-duplication issue.
> > 
> > - libbpf: Support for BTF_DECL_TAG dump in C format
> > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format
> > 
> >   Currently libbpf does not process BTF_DECL_TAG when btf is dumped in
> >   C format. This patch adds a hash table matching btf type ids with a
> >   list of decl tags to the struct btf_dump.
> >   The `btf_dump_emit_decl_tags` is not necessary for the overall
> >   patch-set to function but simplifies testing a bit.
> > 
> > - libbpf: Header guards for selected data structures in vmlinux.h
> > - selftests/bpf: Tests for header guards printing in BTF dump
> > 
> >   Adds option `emit_header_guards` to `struct btf_dump_opts`.
> >   When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif`
> >   brackets around types for which header guard information is present
> >   in BTF.
> > 
> > - bpftool: Enable header guards generation
> > 
> >   Unconditionally enables `emit_header_guards` for BTF dump in C format.
> > 
> > - kbuild: Script to infer header guard values for uapi headers
> > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF
> > 
> >   Adds `scripts/infer_header_guards.pl` and integrates it with
> >   `link-vmlinux.sh`.
> > 
> > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h
> > 
> >   Adds a script `test_uapi_headers.py` that tests header guards with
> >   vmlinux.h by compiling a simple C snippet. The snippet looks as
> >   follows:
> >   
> >     #include <some_uapi_header.h>
> >     #include "vmlinux.h"
> >   
> >     __attribute__((section("tc"), used))
> >     int syncookie_tc(struct __sk_buff *skb) { return 0; }
> >   
> >   The list of headers to test comes from
> >   `tools/testing/selftests/bpf/good_uapi_headers.txt`.
> > 
> > - selftests/bpf: Known good uapi headers for test_uapi_headers.py
> > 
> >   The list of uapi headers that could be included alongside vmlinux.h.
> >   The headers are peeked from the following locations:
> >   - <headers-export-dir>/linux/*.h
> >   - <headers-export-dir>/linux/**/*.h
> >   This choice of locations is somewhat arbitrary.
> > 
> > - selftests/bpf: script for infer_header_guards.pl testing
> > 
> >   The test case for `scripts/infer_header_guards.pl`, verifies that
> >   header guards can be inferred for all uapi headers.
> > 
> > - There is also a patch for dwarves that adds `--header_guards_db`
> >   option (see [2]).
> > 
> > The `test_uapi_headers.py` is important as it demonstrates the
> > the necessary compiler flags:
> > 
> > clang ...                                  \
> >       -D__x86_64__                         \
> >       -Xclang -fwchar-type=short           \
> >       -Xclang -fno-signed-wchar            \
> >       -I{exported_kernel_headers}/include/ \
> >       ...
> > 
> > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because
> >   BPF target uses `int` for `wchar_t` by default and this differs from
> >   `vmlinux.h` definition of the type (at least for x86_64).
> > - `__x86_64__` had to be added for uapi headers that include
> >   `stddef.h` (the one that is supplied my CLANG itself), in order to
> >   define correct sizes for `size_t` and `ptrdiff_t`.
> > - The `{exported_kernel_headers}` stands for exported kernel headers
> >   directory (the headers obtained by `make headers_install` or via
> >   distribution package).
> > 
> > When it works
> > ---
> > 
> > The mechanics described above works for a significant number of UAPI
> > headers. For example, for the test case above I chose the headers from
> > the following locations:
> > - linux/*.h
> > - linux/**/*.h
> > There are 759 such headers and for 677 of them the test described
> > above passes.
> > 
> > I excluded the headers from the following sub-directories as
> > potentially not interesting:
> > 
> >   asm          rdma   video xen
> >   asm-generic  misc   scsi
> >   drm          mtd    sound
> > 
> > Thus saving some time for both discussion and CI but the choice is
> > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all
> > headers) test passes for 834 out of 972 headers.
> > 
> > When it breaks
> > ---
> > 
> > There several scenarios when this mechanics breaks.
> > Specifically I found the following cases:
> > - When uapi header includes some system header that conflicts with
> >   vmlinux.h.
> > - When uapi header itself conflicts with vmlinux.h.
> > 
> > Below are examples for both cases.
> > 
> > Conflict with system headers
> > ----
> > 
> > The following uapi headers:
> > - linux/atmbr2684.h
> > - linux/bpfilter.h
> > - linux/gsmmux.h
> > - linux/icmp.h
> > - linux/if.h
> > - linux/if_arp.h
> > - linux/if_bonding.h
> > - linux/if_pppox.h
> > - linux/if_tunnel.h
> > - linux/ip6_tunnel.h
> > - linux/llc.h
> > - linux/mctp.h
> > - linux/mptcp.h
> > - linux/netdevice.h
> > - linux/netfilter/xt_RATEEST.h
> > - linux/netfilter/xt_hashlimit.h
> > - linux/netfilter/xt_physdev.h
> > - linux/netfilter/xt_rateest.h
> > - linux/netfilter_arp/arp_tables.h
> > - linux/netfilter_arp/arpt_mangle.h
> > - linux/netfilter_bridge.h
> > - linux/netfilter_bridge/ebtables.h
> > - linux/netfilter_ipv4/ip_tables.h
> > - linux/netfilter_ipv6/ip6_tables.h
> > - linux/route.h
> > - linux/wireless.h
> > 
> > Include the following system header:
> > - /usr/include/sys/socket.h (all via linux/if.h)
> > 
> > The sys/socket.h conflicts with vmlinux.h in:
> > - types: struct iovec, struct sockaddr, struct msghdr, ...
> > - constants: SOCK_STREAM, SOCK_DGRAM, ...
> > 
> > However, only two types are actually used:
> > - struct sockaddr
> > - struct sockaddr_storage (used only in linux/mptcp.h)
> > 
> > In 'vmlinux.h' this type originates from 'kernel/include/socket.h'
> > (non UAPI header), thus does not have a header guard.
> > 
> > The only workaround that I see is to:
> > - define a stub sys/socket.h as follows:
> > 
> >     #ifndef __BPF_SOCKADDR__
> >     #define __BPF_SOCKADDR__
> >     
> >     /* For __kernel_sa_family_t */
> >     #include <linux/socket.h>
> >     
> >     struct sockaddr {
> >         __kernel_sa_family_t sa_family;
> >         char sa_data[14];
> >     };
> >     
> >     #endif
> > 
> > - hardcode generation of __BPF_SOCKADDR__ bracket for
> >   'struct sockaddr' in vmlinux.h.
> > 
> > Another possibility is to move the definition of 'struct sockaddr'
> > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h',
> > but I expect that this won't fly with the mainline as it might break
> > the programs that include both 'linux/socket.h' and 'sys/socket.h'.
> > 
> > Conflict with vmlinux.h
> > ----
> > 
> > Uapi header:
> > - linux/signal.h
> > 
> > Conflict with vmlinux.h in definition of 'struct sigaction'.
> > Defined in:
> > - vmlinux.h: kernel/include/linux/signal_types.h
> > - uapi:      kernel/arch/x86/include/asm/signal.h
> > 
> > Uapi headers:
> > - linux/tipc_sockets_diag.h
> > - linux/sock_diag.h
> > 
> > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'.
> > Defined in:
> > - vmlinux.h: kernel/include/net/sock.h
> > - uapi:      kernel/include/uapi/linux/sock_diag.h
> > Constants seem to be unrelated.
> > 
> > And so on... I have details for many other headers but omit those for
> > brevity.
> > 
> > In conclusion
> > ---
> > 
> > Except from the general feasibility I have a few questions:
> > - What UAPI headers are the candidates for such use? If there are some
> >   interesting headers currently not working with this patch-set some
> >   hacks have to be added (e.g. like with `linux/if.h`).
> > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should
> >   I change the BTF format a bit to save some bytes.
> > 
> > Thanks,
> > Eduard
> > 
> > 
> > [1] https://reviews.llvm.org/D111307
> >     [clang] __attribute__ bpf_dominating_decl
> > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@xxxxxxxxx/T/
> >     [RFC dwarves] pahole: Save header guard names when
> >                   --header_guards_db is passed
> > 
> > Eduard Zingerman (12):
> >   libbpf: Deduplicate unambigous standalone forward declarations
> >   selftests/bpf: Tests for standalone forward BTF declarations
> >     deduplication
> >   libbpf: Support for BTF_DECL_TAG dump in C format
> >   selftests/bpf: Tests for BTF_DECL_TAG dump in C format
> >   libbpf: Header guards for selected data structures in vmlinux.h
> >   selftests/bpf: Tests for header guards printing in BTF dump
> >   bpftool: Enable header guards generation
> >   kbuild: Script to infer header guard values for uapi headers
> >   kbuild: Header guards for types from include/uapi/*.h in kernel BTF
> >   selftests/bpf: Script to verify uapi headers usage with vmlinux.h
> >   selftests/bpf: Known good uapi headers for test_uapi_headers.py
> >   selftests/bpf: script for infer_header_guards.pl testing
> > 
> >  scripts/infer_header_guards.pl                | 191 +++++
> >  scripts/link-vmlinux.sh                       |  13 +-
> >  tools/bpf/bpftool/btf.c                       |   4 +-
> >  tools/lib/bpf/btf.c                           | 178 ++++-
> >  tools/lib/bpf/btf.h                           |   7 +-
> >  tools/lib/bpf/btf_dump.c                      | 232 +++++-
> >  .../selftests/bpf/good_uapi_headers.txt       | 677 ++++++++++++++++++
> >  tools/testing/selftests/bpf/prog_tests/btf.c  | 152 ++++
> >  .../selftests/bpf/prog_tests/btf_dump.c       |  11 +-
> >  .../bpf/progs/btf_dump_test_case_decl_tag.c   |  39 +
> >  .../progs/btf_dump_test_case_header_guards.c  |  94 +++
> >  .../bpf/test_uapi_header_guards_infer.sh      |  33 +
> >  .../selftests/bpf/test_uapi_headers.py        | 197 +++++
> >  13 files changed, 1816 insertions(+), 12 deletions(-)
> >  create mode 100755 scripts/infer_header_guards.pl
> >  create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt
> >  create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c
> >  create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh
> >  create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py
> > 





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux