On 30/01/2024 23:05, Bryce Kahle wrote:
> From: Bryce Kahle <bryce.kahle@xxxxxxxxxxxxx>
>
> Enables a user to generate minimized kernel module BTF.
>
> If an eBPF program probes a function within a kernel module or uses
> types that come from a kernel module, split BTF is required. The split
> module BTF contains only the BTF types that are unique to the module.
> It will reference the base/vmlinux BTF types and always starts its type
> IDs at X+1 where X is the largest type ID in the base BTF.
>
> Minimization allows a user to ship only the types necessary to do
> relocations for the program(s) in the provided eBPF object file(s). A
> minimized module BTF will still not contain vmlinux BTF types, so you
> should always minimize the vmlinux file first, and then minimize the
> kernel module file.
>
> Example:
>
> bpftool gen min_core_btf vmlinux.btf vm-min.btf prog.bpf.o
> bpftool -B vm-min.btf gen min_core_btf mod.btf mod-min.btf prog.bpf.o

This is great! I've been working on a somewhat related problem involving
split BTF for modules, and I'm trying to figure out if there's overlap
with what you've done here that can help in either direction. I'll try
and describe what I'm doing. Sorry if this is a bit of a diversion, but
I just want to check if there are potential ways your changes could
facilitate other scenarios in the future.

The problem I'm trying to tackle is to enable split BTF module
generation to be more resilient to underlying kernel BTF changes; this
would allow for example a module that is not built with the kernel to
generate BTF and have it work even if small changes in vmlinux occur.
Even a small change in BTF ids in base BTF is enough to invalidate the
associated split BTF, so the question is how to make this a bit less
brittle. This won't be needed for modules built along with the kernel,
but more for cases like a package delivering a kernel module.
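
To make the brittleness concrete, here is a minimal sketch in plain C.
It is a toy model and does not use libbpf: a base "BTF" is just an array
of type names with ID 0 reserved for void, and all names (toy_base,
toy_split_start_id, toy_ref_still_valid) are hypothetical. It only
illustrates that split IDs start where the base ends, and that a
reference recorded against one base can resolve to a different type once
a single type is added to the base.

```c
#include <assert.h>
#include <string.h>

/* Toy model, not libbpf: names[0] stands for void, as ID 0 does in BTF. */
struct toy_base {
	const char **names;
	unsigned int cnt;	/* number of types, including void */
};

/* First ID available to split BTF built on top of this base. */
static unsigned int toy_split_start_id(const struct toy_base *base)
{
	return base->cnt;
}

/*
 * Return 1 if base type ID 'id', recorded against old_base, still names
 * the same type in new_base, and 0 otherwise.
 */
static int toy_ref_still_valid(const struct toy_base *old_base,
			       const struct toy_base *new_base,
			       unsigned int id)
{
	if (id >= old_base->cnt || id >= new_base->cnt)
		return 0;
	return strcmp(old_base->names[id], new_base->names[id]) == 0;
}
```

With an old base of { void, int, task_struct } and a new base of
{ void, int, long, task_struct }, the split start ID moves from 3 to 4
and a module reference to ID 2 no longer means task_struct.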

The way this is done is similar to what you're doing - generating
minimal base vmlinux BTF along with the module BTF. In my case, however,
the minimization is not driven by CO-RE relocations; rather, it is
driven by adding only the types that are referenced by module BTF, plus
any other associated types they need. We end up with minimal base BTF
that is carried along with the module BTF (in a .BTF.base_minimal
section), and this minimal BTF will be used later to reconcile module
BTF with the running kernel BTF when the module is loaded; it
essentially provides the additional information needed to map to current
vmlinux types.

In this approach, minimal vmlinux BTF is generated via an additional
option to pahole which adds an extra phase to BTF deduplication between
module and kernel. Once we have found the candidate mappings for
deduplication, we can look at all base BTF references from module BTF
and recursively add associated types to the base minimal BTF. Finally,
we reparent the split BTF to this minimal base BTF.

Experiments show most modules wind up with base minimal BTF of around
4000 types, so the minimization seems to work well. But it's complex.

So what I've been trying to work out is whether this dedup complexity
can be eliminated with your changes, but from what I can see, the
membership in the minimal base BTF in your case is driven by the CO-RE
relocations used in the BPF program. Do you think there could be a
future where we look at doing base minimal BTF generation by other
criteria (like references from the module BTF)?

Thanks!
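
The reference-driven selection described above (start from the base
types that module BTF points at, then recursively pull in whatever those
types reference) can be sketched as a reachability walk. This is a toy
model in plain C, not the pahole implementation: struct toy_type, the
refs[] array and both function names are hypothetical, and the dedup
phase that finds candidate mappings is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_REFS 4

/* Hypothetical toy type: refs[] holds the type IDs this type points at. */
struct toy_type {
	unsigned int refs[TOY_MAX_REFS];
	unsigned int nr_refs;
};

/*
 * Mark base type 'id' and everything reachable from it. ID 0 (void) is
 * never marked, and IDs >= base_cnt are module types, not base types.
 * The marked[] check also stops the recursion on self-referencing types.
 */
static void toy_mark_base_type(const struct toy_type *types,
			       unsigned int base_cnt, unsigned int id,
			       bool *marked)
{
	unsigned int i;

	if (id == 0 || id >= base_cnt || marked[id])
		return;
	marked[id] = true;
	for (i = 0; i < types[id].nr_refs; i++)
		toy_mark_base_type(types, base_cnt, types[id].refs[i], marked);
}

/*
 * Walk module types [base_cnt, total_cnt) and mark every base type they
 * reference, directly or indirectly. marked[] must hold base_cnt entries
 * and be zeroed by the caller. Returns the number of base types marked.
 */
static unsigned int toy_collect_minimal_base(const struct toy_type *types,
					     unsigned int base_cnt,
					     unsigned int total_cnt,
					     bool *marked)
{
	unsigned int id, i, nr = 0;

	for (id = base_cnt; id < total_cnt; id++)
		for (i = 0; i < types[id].nr_refs; i++)
			toy_mark_base_type(types, base_cnt,
					   types[id].refs[i], marked);

	for (id = 1; id < base_cnt; id++)
		nr += marked[id];
	return nr;
}
```

The marked set plays the role of the minimal base BTF membership; base
types that no module type reaches stay unmarked and would be left out.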

Alan

> v3->v4:
> - address style nit about start_id initialization
> - rename base to src_base_btf (base_btf is a global var)
> - copy src_base_btf so new BTF is not modifying original vmlinux BTF
>
> Signed-off-by: Bryce Kahle <bryce.kahle@xxxxxxxxxxxxx>
> ---
>  .../bpf/bpftool/Documentation/bpftool-gen.rst | 18 ++++++++++-
>  tools/bpf/bpftool/gen.c                       | 32 +++++++++++++++----
>  2 files changed, 42 insertions(+), 8 deletions(-)
>
> diff --git a/tools/bpf/bpftool/Documentation/bpftool-gen.rst b/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> index 5006e724d..e067d3b05 100644
> --- a/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> +++ b/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> @@ -16,7 +16,7 @@ SYNOPSIS
>
>  	**bpftool** [*OPTIONS*] **gen** *COMMAND*
>
> -	*OPTIONS* := { |COMMON_OPTIONS| | { **-L** | **--use-loader** } }
> +	*OPTIONS* := { |COMMON_OPTIONS| | { **-B** | **--base-btf** } | { **-L** | **--use-loader** } }
>
>  	*COMMAND* := { **object** | **skeleton** | **help** }
>
> @@ -202,6 +202,14 @@ OPTIONS
>  =======
>  	.. include:: common_options.rst
>
> +	-B, --base-btf *FILE*
> +		  Pass a base BTF object. Base BTF objects are typically used
> +		  with BTF objects for kernel modules. To avoid duplicating
> +		  all kernel symbols required by modules, BTF objects for
> +		  modules are "split", they are built incrementally on top of
> +		  the kernel (vmlinux) BTF object. So the base BTF reference
> +		  should usually point to the kernel BTF.
> +
>  	-L, --use-loader
>  		  For skeletons, generate a "light" skeleton (also known as "loader"
>  		  skeleton). A light skeleton contains a loader eBPF program. It does
> @@ -444,3 +452,11 @@ ones given to min_core_btf.
>    obj = bpf_object__open_file("one.bpf.o", &opts);
>
>    ...
> +
> +Kernel module BTF may also be minimized by using the -B option:
> +
> +**$ bpftool -B 5.4.0-smaller.btf gen min_core_btf 5.4.0-module.btf 5.4.0-module-smaller.btf one.bpf.o**
> +
> +A minimized module BTF will still not contain vmlinux BTF types, so you
> +should always minimize the vmlinux file first, and then minimize the
> +kernel module file.
> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index ee3ce2b80..57691f766 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -1630,6 +1630,7 @@ static int do_help(int argc, char **argv)
>  		" %1$s %2$s help\n"
>  		"\n"
>  		" " HELP_SPEC_OPTIONS " |\n"
> +		" {-B|--base-btf} |\n"
>  		" {-L|--use-loader} }\n"
>  		"",
>  		bin_name, "gen");
> @@ -1695,14 +1696,14 @@ btfgen_new_info(const char *targ_btf_path)
>  	if (!info)
>  		return NULL;
>
> -	info->src_btf = btf__parse(targ_btf_path, NULL);
> +	info->src_btf = btf__parse_split(targ_btf_path, base_btf);
>  	if (!info->src_btf) {
>  		err = -errno;
>  		p_err("failed parsing '%s' BTF file: %s", targ_btf_path, strerror(errno));
>  		goto err_out;
>  	}
>
> -	info->marked_btf = btf__parse(targ_btf_path, NULL);
> +	info->marked_btf = btf__parse_split(targ_btf_path, base_btf);
>  	if (!info->marked_btf) {
>  		err = -errno;
>  		p_err("failed parsing '%s' BTF file: %s", targ_btf_path, strerror(errno));
> @@ -2139,12 +2140,29 @@ static int btfgen_remap_id(__u32 *type_id, void *ctx)
>  /* Generate BTF from relocation information previously recorded */
>  static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  {
> -	struct btf *btf_new = NULL;
> +	struct btf *btf_new = NULL, *src_base_btf_new = NULL;
>  	unsigned int *ids = NULL;
> +	const struct btf *src_base_btf;
>  	unsigned int i, n = btf__type_cnt(info->marked_btf);
> -	int err = 0;
> +	int start_id, err = 0;
> +
> +	src_base_btf = btf__base_btf(info->src_btf);
> +	start_id = src_base_btf ? btf__type_cnt(src_base_btf) : 1;
>
> -	btf_new = btf__new_empty();
> +	/* clone BTF to sanitize a copy and leave the original intact */
> +	if (src_base_btf) {
> +		const void *raw_data;
> +		__u32 sz;
> +
> +		raw_data = btf__raw_data(src_base_btf, &sz);
> +		src_base_btf_new = btf__new(raw_data, sz);
> +		if (!src_base_btf_new) {
> +			err = -errno;
> +			goto err_out;
> +		}
> +	}
> +
> +	btf_new = btf__new_empty_split(src_base_btf_new);
>  	if (!btf_new) {
>  		err = -errno;
>  		goto err_out;
> @@ -2157,7 +2175,7 @@ static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  	}
>
>  	/* first pass: add all marked types to btf_new and add their new ids to the ids map */
> -	for (i = 1; i < n; i++) {
> +	for (i = start_id; i < n; i++) {
>  		const struct btf_type *cloned_type, *type;
>  		const char *name;
>  		int new_id;
> @@ -2213,7 +2231,7 @@ static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  	}
>
>  	/* second pass: fix up type ids */
> -	for (i = 1; i < btf__type_cnt(btf_new); i++) {
> +	for (i = start_id; i < btf__type_cnt(btf_new); i++) {
>  		struct btf_type *btf_type = (struct btf_type *) btf__type_by_id(btf_new, i);
>
>  		err = btf_type_visit_type_ids(btf_type, btfgen_remap_id, ids);
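
For readers following the two loops in the quoted hunks, the shape of
the "copy marked types, then fix up IDs" scheme with a non-zero start_id
can be sketched in plain C. This is a toy model and not the bpftool
code: struct toy_type carries a single reference, toy_minimize_split is
a hypothetical name, and the sketch assumes that references to IDs below
start_id (base types) are left unchanged, which the quoted hunks do not
show either way.

```c
#include <assert.h>

/* Toy type with a single reference to another type ID (0 means void). */
struct toy_type {
	unsigned int ref;
};

/*
 * Copy the marked split types from src[] into dst[], renumbering them
 * densely from start_id. IDs below start_id belong to the base and are
 * never copied. dst[] and ids[] must each hold n entries.
 * Returns the new total type count (one past the last assigned ID).
 */
static unsigned int toy_minimize_split(const struct toy_type *src,
				       const unsigned char *marked,
				       unsigned int start_id, unsigned int n,
				       struct toy_type *dst, unsigned int *ids)
{
	unsigned int i, next = start_id;

	/* first pass: copy marked types and record old ID -> new ID */
	for (i = start_id; i < n; i++) {
		ids[i] = 0;
		if (!marked[i])
			continue;
		dst[next] = src[i];
		ids[i] = next++;
	}

	/* second pass: fix up references that point into the split range */
	for (i = start_id; i < next; i++) {
		if (dst[i].ref >= start_id)
			dst[i].ref = ids[dst[i].ref];
	}

	return next;
}
```

Starting both loops at start_id instead of 1 is what keeps base IDs out
of the renumbering; with no base, start_id is 1 and the loops cover
every type except void.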