Re: [PATCH v6 bpf-next 0/9] bpf: support resilient split BTF


 



On Thu, Jun 13, 2024 at 2:50 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> Split BPF Type Format (BTF) provides huge advantages in that kernel
> modules only have to provide type information for types that they do not
> share with the core kernel; for core kernel types, split BTF refers to
> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
> uses that structure (or a pointer to it) simply needs to refer to the
> core kernel type id, saving the need to define the structure and its many
> dependents.  This cuts down on duplication and makes BTF as compact
> as possible.
>
> However, there is a downside.  This scheme requires the references from
> split BTF to base BTF to be valid not just at encoding time, but at use
> time (when the module is loaded).  Even a small change in kernel types
> can perturb the type ids in core kernel BTF, and - if the new reproducible
> BTF option is not used - pahole's parallel processing of compilation units
> can lead to different type ids for the same kernel if the BTF is
> regenerated.
>
> So we have a robustness problem for split BTF for cases where a module is
> not always compiled at the same time as the kernel.  This problem is
> particularly acute for distros which generally want module builders to be
> able to compile a module for the lifetime of a Linux stable-based release,
> and have it continue to be valid over the lifetime of that release, even
> as changes in data structures (and hence BTF types) accrue.  Today it's not
> possible to generate BTF for modules that works beyond the initial
> kernel it is compiled against - kernel bugfixes etc invalidate the split
> BTF references to vmlinux BTF, and BTF is no longer usable for the
> module.
>
> The goal of this series is to supply additional context for cases
> like this.  That context comes in the form of
> distilled base BTF; it stands in for the base BTF, and contains
> information about the types referenced from split BTF, but not their
> full descriptions.  The modified split BTF will refer to type ids in
> this .BTF.base section, and when the kernel loads such modules it
> will use that .BTF.base to map references from split BTF to the
> equivalent current vmlinux base BTF types.  Once this relocation
> process has succeeded, the module BTF available in /sys/kernel/btf
> will look exactly as if it was built with the current vmlinux;
> references to base types will be fixed up etc.
>
> A module builder - using this series along with the pahole changes -
> can then build a module with distilled base BTF via an out-of-tree
> module build, i.e.
>
> make -C . M=path/2/module
>
> The module will have a .BTF section (the split BTF) and a
> .BTF.base section.  The latter is small in size - distilled base
> BTF does not need full struct/union/enum information for named
> types for example.  For 2667 modules built with distilled base BTF,
> the average size observed was 1556 bytes (stddev 1563).  The overall
> size added to these 2667 modules was 5.3Mb.
>
> Note that for the in-tree modules, this approach is not needed as
> split and base BTF in the case of in-tree modules are always built
> and re-built together.
>
> The series first focuses on generating split BTF with distilled base
> BTF; then relocation support is added to allow split BTF with
> an associated distilled base to be relocated with a new base BTF.
>
> Next, Eduard's patch allows BTF ELF parsing to work with both
> .BTF and .BTF.base sections; this ensures that bpftool will be
> able to dump BTF for a module with a .BTF.base section for example,
> or indeed dump relocated BTF where a module and a "-B vmlinux"
> is supplied.
>
> Then we add support to resolve_btfids to ignore base BTF - i.e.
> to avoid relocation - if a .BTF.base section is found.  This ensures
> the .BTF.ids section is populated with ids relative to the distilled
> base (these will be relocated as part of module load).
>
> Finally the series supports storage of .BTF.base data/size in modules
> and supports sharing of relocation code with the kernel to allow
> relocation of module BTF.  For the kernel, this relocation
> process happens at module load time, and we relocate split BTF
> references to point at types in the current vmlinux BTF.  As part of
> this, .BTF.ids references need to be mapped also.
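
A minimal sketch of the id fix-up described above, not the code from the series. It assumes that relocation yields a table mapping each distilled base id to a vmlinux id, and that split type ids follow the base ids and so shift by the difference in base sizes. The names reloc_ctx, remap_id and remap_ids are hypothetical, and a real .BTF.ids section holds more than a flat array of ids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation context; not the libbpf or kernel structure. */
struct reloc_ctx {
	const uint32_t *dist_to_base; /* distilled base id -> vmlinux id */
	uint32_t nr_dist_types;       /* ids in distilled base, incl. id 0 (void) */
	uint32_t nr_base_types;       /* ids in vmlinux BTF, incl. id 0 (void) */
};

/* Translate one id that was expressed relative to the distilled base. */
static uint32_t remap_id(const struct reloc_ctx *ctx, uint32_t id)
{
	if (id < ctx->nr_dist_types)
		return ctx->dist_to_base[id];
	/* Split types follow the base, so they shift with the base size. */
	return id - ctx->nr_dist_types + ctx->nr_base_types;
}

/* Rewrite a flat array of ids in place (a simplified .BTF.ids). */
static void remap_ids(const struct reloc_ctx *ctx, uint32_t *ids, size_t n)
{
	assert(ctx && ids);
	for (size_t i = 0; i < n; i++)
		ids[i] = remap_id(ctx, ids[i]);
}
```

With a distilled base of 3 ids and a vmlinux base of 100000 ids, id 3 (the first split type) would become 100000 under these assumptions.
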
>
> So concretely, what happens is:
>
> - we generate split BTF in the .BTF section of a module that refers to
>   types in the .BTF.base section as base types; the latter are not full
>   type descriptions but provide information about the base type.  So
>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
>   distilled base BTF for example.
> - when the module is loaded, the split BTF is relocated with vmlinux
>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
>   in vmlinux BTF and map all split BTF references to the distilled base
>   FWD sk_buff, replacing them with references to the vmlinux BTF
>   STRUCT sk_buff.
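
The two steps above can be modelled with a toy type table. This is only a sketch following the FWD struct sk_buff example in the cover letter: toy_kind, toy_type and find_base_id are hypothetical names, the linear scan is for brevity, and the size and embeddedness checks mentioned in the changelog are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for BTF kinds; real BTF has many more. */
enum toy_kind { TOY_FWD_STRUCT, TOY_STRUCT, TOY_INT };

struct toy_type {
	enum toy_kind kind;
	const char *name;
};

/*
 * Return the vmlinux id matching a distilled base type, or 0 if none.
 * Index 0 of base[] is a placeholder for void and is never matched.
 */
static uint32_t find_base_id(const struct toy_type *dist,
			     const struct toy_type *base, uint32_t nr_base)
{
	assert(dist && base);
	for (uint32_t id = 1; id < nr_base; id++) {
		if (strcmp(base[id].name, dist->name) != 0)
			continue;
		/* A distilled FWD struct stands in for a full STRUCT. */
		if (dist->kind == TOY_FWD_STRUCT && base[id].kind == TOY_STRUCT)
			return id;
		if (dist->kind == base[id].kind)
			return id;
	}
	return 0;
}
```

Every split BTF reference to the distilled FWD would then be rewritten to the returned id.
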
>
> A previous approach to this problem [1] utilized standalone BTF for such
> cases - where the BTF is not defined relative to base BTF so there is no
> relocation required.  The problem with that approach is that from
> the verifier perspective, some types are special, and having a custom
> representation of a core kernel type that did not necessarily match the
> current representation is not tenable.  So the approach taken here was
> to preserve the split BTF model while minimizing the representation of
> the context needed to relocate split and current vmlinux BTF.
>
> To generate distilled .BTF.base sections the associated dwarves
> patch (to be applied on the "next" branch there) is needed [3].
> Without it, things will still work but modules will not be built
> with a .BTF.base section.
>
> Changes since v5[4]:
>
> - Update search of distilled types to return the first occurrence
>   of a string (or a string+size pair); this allows us to iterate
>   over all matches in distilled base BTF (Andrii, patch 3)
> - Update to use BTF field iterators (Andrii, patches 1, 3 and 8)
> - Update tests to cover multiple match and associated error cases
>   (Eduard, patch 4)
> - Rename elf_sections_info to btf_elf_secs, remove use of
>   libbpf_get_error(), reset btf->owns_base when relocation
>   succeeds (Andrii, patch 5)
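
The "first occurrence" search in the first item above can be illustrated with a lower-bound binary search over a sorted name array. This is a sketch under simplifying assumptions: it matches on name only, whereas the series also matches on a name+size pair, and first_match and count_matches are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry equal to name in a sorted array,
 * or -1 if there is none (lower-bound binary search).
 */
static long first_match(const char *const *names, size_t n, const char *name)
{
	size_t lo = 0, hi = n;

	assert(names && name);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(names[mid], name) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < n && strcmp(names[lo], name) == 0)
		return (long)lo;
	return -1;
}

/* Count all entries equal to name by walking forward from the first. */
static size_t count_matches(const char *const *names, size_t n,
			    const char *name)
{
	long i = first_match(names, n, name);
	size_t cnt = 0;

	if (i < 0)
		return 0;
	for (size_t j = (size_t)i; j < n && strcmp(names[j], name) == 0; j++)
		cnt++;
	return cnt;
}
```

Because the search lands on the first of several equal entries, a caller can walk forward over every candidate and reject ambiguous matches.
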
>
> Changes since v4[5]:
>
> - Moved embeddedness, duplicate name checks to relocation time
>   and record struct/union size for all distilled struct/unions
>   instead of using forwards.  This allows us to carry out
>   type compatibility checks based on the base BTF we want to
>   relocate with (Eduard, patches 1, 3)
> - Moved to using qsort() instead of qsort_r() as support for
>   qsort_r() appears to be missing in Android libc (Andrii, patch 3)
> - Sorting/searching now incorporates size matching depending
>   on BTF kind and embeddedness of struct/union (Eduard, Andrii,
>   patch 3)
> - Improved naming of various types during relocation to avoid
>   confusion (Andrii, patch 3)
> - Incorporated Eduard's patch (patch 5) which handles .BTF.base
>   sections internally in btf_parse_elf().  This makes ELF parsing
>   work with split BTF, split BTF with a distilled base, split
>   BTF with a distilled base _and_ base BTF (by relocating) etc.
>   Having this avoids the need for bpftool changes; it will work
>   as-is with .BTF.base sections (Eduard, patch 4)
> - Updated resolve_btfids to _not_ relocate BTF for modules
>   where a .BTF.base section is present; in that one case we
>   do not want to relocate BTF as the .BTF.ids section should
>   reflect ids in .BTF.base which will later be relocated on
>   module load (Eduard, Andrii, patch 5)
>
> Changes since v3[6]:
>
> - distill now checks for duplicate-named struct/unions and records
>   them as a sized struct/union to help identify which of the
>   multiple base BTF structs/unions it refers to (Eduard, patch 1)
> - added test support for multiple name handling (Eduard, patch 2)
> - simplified the string mapping when updating split BTF to use
>   base BTF instead of distilled base.  Since the only string
>   references split BTF can make to base BTF are the names of
>   the base types, create a string map from distilled string
>   offset -> base BTF string offset and update string offsets
>   by visiting all strings in split BTF; this saves having to
>   do costly searches of base BTF (Eduard, patch 7,10)
> - fixed bpftool manpage and indentation issues (Quentin, patch 11)
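
A rough sketch of the string offset map from the v3 changes above, assuming a string section is a run of NUL-terminated strings that starts with the empty string. find_str and build_str_map are hypothetical names, not the functions in the series, and the linear lookup is only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the offset of str in a section of NUL-terminated strings. */
static int64_t find_str(const char *sec, size_t sec_len, const char *str)
{
	size_t off = 0;

	while (off < sec_len) {
		if (strcmp(sec + off, str) == 0)
			return (int64_t)off;
		off += strlen(sec + off) + 1;
	}
	return -1;
}

/*
 * Fill map[off] with the base offset of each string that starts at
 * offset off in the distilled section; map must have dist_len entries.
 * Returns 0 on success, -1 if a distilled string is missing from base.
 */
static int build_str_map(const char *dist, size_t dist_len,
			 const char *base, size_t base_len, uint32_t *map)
{
	size_t off = 0;

	assert(dist && base && map);
	while (off < dist_len) {
		int64_t base_off = find_str(base, base_len, dist + off);

		if (base_off < 0)
			return -1;
		map[off] = (uint32_t)base_off;
		off += strlen(dist + off) + 1;
	}
	return 0;
}
```

Once the map is built, each string offset in split BTF that points into the distilled section can be replaced by a single table lookup.
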
>
> Also explored Eduard's suggestion of doing an implicit fallback
> to checking for .BTF.base section in btf__parse() when it is
> called to get base BTF.  However while it is doable, it turned
> out to be difficult operationally.  Since fallback is implicit
> we do not know the source of the BTF - was it from .BTF or
> .BTF.base? In bpftool, we want to try first standalone BTF,
> then split, then split with distilled base.  Having a way
> to explicitly request .BTF.base via btf__parse_opts() fits
> that model better.
>
> Changes since v2[7]:
>
> - submitted patch to use --btf_features in Makefile.btf for pahole
>   v1.26 and later separately (Andrii).  That has landed in bpf-next
>   now.
> - distilled base now encodes ENUM64 as fwd ENUM (size 8), eliminating
>   the need for support for ENUM64 in btf__add_fwd (patch 1, Andrii)
> - moved to distilling only named types, augmenting split BTF with
>   associated reference types; this simplifies greatly the distilled
>   base BTF and the mapping operation between distilled and base
>   BTF when relocating (most of the series changes, Andrii)
> - relocation now iterates over base BTF, looking for matches based
>   on name in distilled BTF.  Distilled BTF is pre-sorted by name
>   (Andrii, patch 8)
> - removed most redundant compatibility checks aside from struct
>   size for base types/embedded structs and kind compatibility
>   (since we only match on name) (Andrii, patch 8)
> - btf__parse_opts() now replaces btf_parse() internally in libbpf
>   (Eduard, patch 3)
>
> Changes since RFC [8]:
>
> - updated terminology; we replace clunky "base reference" BTF with
>   distilling base BTF into a .BTF.base section. Similarly BTF
>   reconciliation becomes BTF relocation (Andrii, most patches)
> - add distilled base BTF by default for out-of-tree modules
>   (Alexei, patch 8)
> - distill algorithm updated to record size of embedded struct/union
>   by recording it as a 0-vlen STRUCT/UNION with size preserved
>   (Andrii, patch 2)
> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
>   patch 9)
> - with embedded STRUCT/UNION recording size, we can have bpftool
>   dump a header representation using .BTF.base + .BTF sections
>   rather than special-casing and refusing to use "format c" for
>   that case (patch 5)
> - match enum with enum64 and vice versa (Andrii, patch 9)
> - ensure that resolve_btfids works with BTF without .BTF.base
>   section (patch 7)
> - update tests to cover embedded types, arrays and function
>   prototypes (patches 3, 12)
>
> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@xxxxxxxxxx/
> [2] https://lore.kernel.org/bpf/20240501175035.2476830-1-alan.maguire@xxxxxxxxxx/
> [3] https://lore.kernel.org/bpf/20240517102714.4072080-1-alan.maguire@xxxxxxxxxx/
> [4] https://lore.kernel.org/bpf/20240528122408.3154936-1-alan.maguire@xxxxxxxxxx/
> [5] https://lore.kernel.org/bpf/20240517102246.4070184-1-alan.maguire@xxxxxxxxxx/
> [6] https://lore.kernel.org/bpf/20240510103052.850012-1-alan.maguire@xxxxxxxxxx/
> [7] https://lore.kernel.org/bpf/20240424154806.3417662-1-alan.maguire@xxxxxxxxxx/
> [8] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@xxxxxxxxxx/
>
> Alan Maguire (8):
>   libbpf: add btf__distill_base() creating split BTF with distilled base
>     BTF
>   selftests/bpf: test distilled base, split BTF generation
>   libbpf: split BTF relocation
>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
>   resolve_btfids: handle presence of .BTF.base section

I've landed patches up to this point. But please see my comments and
address them in the follow up.

>   module, bpf: store BTF base pointer in struct module
>   libbpf,bpf: share BTF relocate-related code with kernel
>   kbuild,bpf: add module-specific pahole flags for distilled base BTF
>
> Eduard Zingerman (1):
>   libbpf: make btf_parse_elf process .BTF.base transparently
>
>  include/linux/btf.h                           |  64 ++
>  include/linux/module.h                        |   2 +
>  kernel/bpf/Makefile                           |  10 +-
>  kernel/bpf/btf.c                              | 168 +++--
>  kernel/module/main.c                          |   5 +-
>  scripts/Makefile.btf                          |   5 +
>  scripts/Makefile.modfinal                     |   2 +-
>  tools/bpf/resolve_btfids/main.c               |   8 +
>  tools/lib/bpf/Build                           |   2 +-
>  tools/lib/bpf/btf.c                           | 660 ++++++++++++------
>  tools/lib/bpf/btf.h                           |  36 +
>  tools/lib/bpf/btf_iter.c                      | 177 +++++
>  tools/lib/bpf/btf_relocate.c                  | 529 ++++++++++++++
>  tools/lib/bpf/libbpf.map                      |   2 +
>  tools/lib/bpf/libbpf_internal.h               |   3 +
>  .../selftests/bpf/prog_tests/btf_distill.c    | 552 +++++++++++++++
>  16 files changed, 1955 insertions(+), 270 deletions(-)
>  create mode 100644 tools/lib/bpf/btf_iter.c
>  create mode 100644 tools/lib/bpf/btf_relocate.c
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
>
> --
> 2.31.1
>




