Re: [PATCH bpf-next v4] bpftool: add support for split BTF to gen min_core_btf

On 30/01/2024 23:05, Bryce Kahle wrote:
> From: Bryce Kahle <bryce.kahle@xxxxxxxxxxxxx>
> 
> Enables a user to generate minimized kernel module BTF.
> 
> If an eBPF program probes a function within a kernel module or uses
> types that come from a kernel module, split BTF is required. The split
> module BTF contains only the BTF types that are unique to the module.
> It will reference the base/vmlinux BTF types and always starts its type
> IDs at X+1 where X is the largest type ID in the base BTF.
> 
> Minimization allows a user to ship only the types necessary to do
> relocations for the program(s) in the provided eBPF object file(s). A
> minimized module BTF will still not contain vmlinux BTF types, so you
> should always minimize the vmlinux file first, and then minimize the
> kernel module file.
> 
> Example:
> 
> bpftool gen min_core_btf vmlinux.btf vm-min.btf prog.bpf.o
> bpftool -B vm-min.btf gen min_core_btf mod.btf mod-min.btf prog.bpf.o

This is great! I've been working on a somewhat related problem involving
split BTF for modules, and I'm trying to figure out if there's overlap
with what you've done here that could help in either direction. I'll try
to describe what I'm doing. Sorry if this is a bit of a diversion, but I
want to check whether there are ways your changes could facilitate other
scenarios in the future.

The problem I'm trying to tackle is making split BTF generation for
modules more resilient to changes in the underlying kernel BTF; this
would allow, for example, a module that is not built with the kernel to
generate BTF and still have it work even if small changes occur in
vmlinux. Currently, even a small shift in BTF IDs in the base BTF is
enough to invalidate the associated split BTF, so the question is how to
make this a bit less brittle. This isn't needed for modules built along
with the kernel, but rather for cases like a package delivering an
out-of-tree kernel module.
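
To make the ID coupling concrete, here is a rough sketch (file names and
error reporting are just placeholders, nothing from your patch) of how
the split BTF hangs off its base on the libbpf side; the module's type
IDs simply continue from the base's last ID, so they are only meaningful
against that exact base:

#include <stdio.h>
#include <bpf/btf.h>

int main(int argc, char **argv)
{
	struct btf *base, *split;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <vmlinux-btf> <module-btf>\n", argv[0]);
		return 1;
	}

	base = btf__parse(argv[1], NULL);
	if (!base)
		return 1;

	/* module BTF only parses as split BTF on top of its base */
	split = btf__parse_split(argv[2], base);
	if (!split) {
		btf__free(base);
		return 1;
	}

	/* split type IDs start at btf__type_cnt(base); add, remove or
	 * reorder anything in the base and every module reference into
	 * it now points at the wrong type
	 */
	printf("module type IDs start at %u\n", btf__type_cnt(base));

	btf__free(split);
	btf__free(base);
	return 0;
}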

The way this is done is similar to what you're doing - generating a
minimal base vmlinux BTF along with the module BTF. In my case, however,
the minimization is not driven by CO-RE relocations; rather, it is
driven by adding only the types that are referenced from the module BTF,
plus any other associated types they need. We end up with a minimal base
BTF that is carried along with the module BTF (in a .BTF.base_minimal
section), and this minimal BTF is later used to reconcile the module BTF
with the running kernel's BTF when the module is loaded; it essentially
provides the additional information needed to map to the current vmlinux
types.
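
To be clearer about what I mean by "reconcile": the minimal base
essentially acts as a by-name/by-kind lookup key into the running
kernel's BTF. Purely as an illustration of the idea (not the actual
implementation - anonymous types and proper compatibility checking are
skipped), something like:

#include <bpf/btf.h>

/* Illustration only: map each named type in the minimal base BTF to the
 * corresponding type in the running kernel's BTF by name and kind,
 * building an old-ID -> new-ID table that could be used to rewrite the
 * module's references into the live base.
 */
static int build_base_map(const struct btf *min_base,
			  const struct btf *vmlinux_btf,
			  __u32 *id_map /* indexed by minimal-base ID */)
{
	__u32 i, n = btf__type_cnt(min_base);

	for (i = 1; i < n; i++) {
		const struct btf_type *t = btf__type_by_id(min_base, i);
		const char *name = btf__name_by_offset(min_base, t->name_off);
		__s32 new_id;

		if (!name || !name[0])
			continue; /* anonymous types need structural matching */

		new_id = btf__find_by_name_kind(vmlinux_btf, name, btf_kind(t));
		if (new_id < 0)
			return new_id; /* type no longer exists in this kernel */

		id_map[i] = new_id;
	}
	return 0;
}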

In this approach, the minimal vmlinux BTF is generated via an additional
option to pahole, which adds an extra phase to BTF deduplication between
the module and the kernel. Once we have found the candidate mappings for
deduplication, we look at all base BTF references from the module BTF
and recursively add the associated types to the minimal base BTF.
Finally, we reparent the split BTF onto this minimal base BTF.
Experiments show that most modules wind up with a minimal base BTF of
around 4000 types, so the minimization seems to work well. But it is
complex.
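
For what it's worth, the "look at all base BTF references from module
BTF" step is conceptually the same type-ID visitor walk that btfgen
already does for remapping. A rough sketch (using the libbpf-internal
btf_type_visit_type_ids() helper that gen.c already relies on; the
recursion into each marked type's own dependencies is left out):

#include <stdbool.h>
#include <bpf/btf.h>
/* btf_type_visit_type_ids() comes from libbpf's internal header
 * (bpf/libbpf_internal.h), the same helper gen.c uses for remapping.
 */

struct mark_ctx {
	__u32 base_cnt;   /* btf__type_cnt() of the base BTF */
	bool *marked;     /* one flag per base type ID */
};

static int mark_base_ref(__u32 *type_id, void *data)
{
	struct mark_ctx *ctx = data;

	/* only IDs below the split start belong to the base BTF */
	if (*type_id && *type_id < ctx->base_cnt)
		ctx->marked[*type_id] = true;
	return 0;
}

static int mark_module_refs(const struct btf *split_btf, struct mark_ctx *ctx)
{
	__u32 i, n = btf__type_cnt(split_btf);

	/* walk only the module's own (split) types */
	for (i = ctx->base_cnt; i < n; i++) {
		struct btf_type *t = (struct btf_type *)btf__type_by_id(split_btf, i);
		int err = btf_type_visit_type_ids(t, mark_base_ref, ctx);

		if (err)
			return err;
	}
	return 0;
}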

So what I've been trying to work out is whether this dedup complexity
could be eliminated by your changes. From what I can see, though,
membership in the minimal base BTF in your case is driven by the CO-RE
relocations used in the BPF program. Do you think there could be a
future where we look at driving minimal base BTF generation by other
criteria (such as references from the module BTF)? Thanks!

Alan

> v3->v4:
> - address style nit about start_id initialization
> - rename base to src_base_btf (base_btf is a global var)
> - copy src_base_btf so new BTF is not modifying original vmlinux BTF
> 
> Signed-off-by: Bryce Kahle <bryce.kahle@xxxxxxxxxxxxx>
> ---
>  .../bpf/bpftool/Documentation/bpftool-gen.rst | 18 ++++++++++-
>  tools/bpf/bpftool/gen.c                       | 32 +++++++++++++++----
>  2 files changed, 42 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/Documentation/bpftool-gen.rst b/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> index 5006e724d..e067d3b05 100644
> --- a/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> +++ b/tools/bpf/bpftool/Documentation/bpftool-gen.rst
> @@ -16,7 +16,7 @@ SYNOPSIS
>  
>  	**bpftool** [*OPTIONS*] **gen** *COMMAND*
>  
> -	*OPTIONS* := { |COMMON_OPTIONS| | { **-L** | **--use-loader** } }
> +	*OPTIONS* := { |COMMON_OPTIONS| | { **-B** | **--base-btf** } | { **-L** | **--use-loader** } }
>  
>  	*COMMAND* := { **object** | **skeleton** | **help** }
>  
> @@ -202,6 +202,14 @@ OPTIONS
>  =======
>  	.. include:: common_options.rst
>  
> +	-B, --base-btf *FILE*
> +		  Pass a base BTF object. Base BTF objects are typically used
> +		  with BTF objects for kernel modules. To avoid duplicating
> +		  all kernel symbols required by modules, BTF objects for
> +		  modules are "split", they are built incrementally on top of
> +		  the kernel (vmlinux) BTF object. So the base BTF reference
> +		  should usually point to the kernel BTF.
> +
>  	-L, --use-loader
>  		  For skeletons, generate a "light" skeleton (also known as "loader"
>  		  skeleton). A light skeleton contains a loader eBPF program. It does
> @@ -444,3 +452,11 @@ ones given to min_core_btf.
>    obj = bpf_object__open_file("one.bpf.o", &opts);
>  
>    ...
> +
> +Kernel module BTF may also be minimized by using the -B option:
> +
> +**$ bpftool -B 5.4.0-smaller.btf gen min_core_btf 5.4.0-module.btf 5.4.0-module-smaller.btf one.bpf.o**
> +
> +A minimized module BTF will still not contain vmlinux BTF types, so you
> +should always minimize the vmlinux file first, and then minimize the
> +kernel module file.
> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index ee3ce2b80..57691f766 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -1630,6 +1630,7 @@ static int do_help(int argc, char **argv)
>  		"       %1$s %2$s help\n"
>  		"\n"
>  		"       " HELP_SPEC_OPTIONS " |\n"
> +		"                    {-B|--base-btf} |\n"
>  		"                    {-L|--use-loader} }\n"
>  		"",
>  		bin_name, "gen");
> @@ -1695,14 +1696,14 @@ btfgen_new_info(const char *targ_btf_path)
>  	if (!info)
>  		return NULL;
>  
> -	info->src_btf = btf__parse(targ_btf_path, NULL);
> +	info->src_btf = btf__parse_split(targ_btf_path, base_btf);
>  	if (!info->src_btf) {
>  		err = -errno;
>  		p_err("failed parsing '%s' BTF file: %s", targ_btf_path, strerror(errno));
>  		goto err_out;
>  	}
>  
> -	info->marked_btf = btf__parse(targ_btf_path, NULL);
> +	info->marked_btf = btf__parse_split(targ_btf_path, base_btf);
>  	if (!info->marked_btf) {
>  		err = -errno;
>  		p_err("failed parsing '%s' BTF file: %s", targ_btf_path, strerror(errno));
> @@ -2139,12 +2140,29 @@ static int btfgen_remap_id(__u32 *type_id, void *ctx)
>  /* Generate BTF from relocation information previously recorded */
>  static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  {
> -	struct btf *btf_new = NULL;
> +	struct btf *btf_new = NULL, *src_base_btf_new = NULL;
>  	unsigned int *ids = NULL;
> +	const struct btf *src_base_btf;
>  	unsigned int i, n = btf__type_cnt(info->marked_btf);
> -	int err = 0;
> +	int start_id, err = 0;
> +
> +	src_base_btf = btf__base_btf(info->src_btf);
> +	start_id = src_base_btf ? btf__type_cnt(src_base_btf) : 1;
>  
> -	btf_new = btf__new_empty();
> +	/* clone BTF to sanitize a copy and leave the original intact */
> +	if (src_base_btf) {
> +		const void *raw_data;
> +		__u32 sz;
> +
> +		raw_data = btf__raw_data(src_base_btf, &sz);
> +		src_base_btf_new = btf__new(raw_data, sz);
> +		if (!src_base_btf_new) {
> +			err = -errno;
> +			goto err_out;
> +		}
> +	}
> +
> +	btf_new = btf__new_empty_split(src_base_btf_new);
>  	if (!btf_new) {
>  		err = -errno;
>  		goto err_out;
> @@ -2157,7 +2175,7 @@ static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  	}
>  
>  	/* first pass: add all marked types to btf_new and add their new ids to the ids map */
> -	for (i = 1; i < n; i++) {
> +	for (i = start_id; i < n; i++) {
>  		const struct btf_type *cloned_type, *type;
>  		const char *name;
>  		int new_id;
> @@ -2213,7 +2231,7 @@ static struct btf *btfgen_get_btf(struct btfgen_info *info)
>  	}
>  
>  	/* second pass: fix up type ids */
> -	for (i = 1; i < btf__type_cnt(btf_new); i++) {
> +	for (i = start_id; i < btf__type_cnt(btf_new); i++) {
>  		struct btf_type *btf_type = (struct btf_type *) btf__type_by_id(btf_new, i);
>  
>  		err = btf_type_visit_type_ids(btf_type, btfgen_remap_id, ids);



