Re: [PATCH bpf-next v4] bpftool: add support for split BTF to gen min_core_btf

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 9 Feb 2024 11:58:09 -0800

On Thu, Feb 8, 2024 at 3:01 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 08/02/2024 01:56, Yonghong Song wrote:
> >
> > On 2/7/24 4:30 PM, Andrii Nakryiko wrote:
> >> On Wed, Feb 7, 2024 at 2:38 PM Yonghong Song <yonghong.song@xxxxxxxxx>
> >> wrote:
> >>>
> >>> On 2/7/24 10:51 AM, Bryce Kahle wrote:
> >>>> On Mon, Feb 5, 2024 at 10:21 AM Andrii Nakryiko
> >>>> <andrii.nakryiko@xxxxxxxxx> wrote:
> >>>>> 3) btf__dedup() will deduplicate everything, so that only unique type
> >>>>> definitions remain.
> >>>>>
> >>> A random thought about another way.
> >>> At module side, we keep
> >>>     - module btf
> >>>     - another section (e.g. .BTF.extra) to keep minimum kernel-side
> >>>       types which directly used by module btf
> >>>
>
> Yep, that's exactly the approach I was pursuing; an extra section
> containing those types (I was calling it .BTF.base_minimal).
>
> >>>     for example, module btf has
> >>>       struct foo {
> >>>         struct task_struct *t;
> >>>       }
> >>>       module btf encoding will have id, say 20,
> >>>       for 'struct task_struct' which is at that time
> >>>       the id in linux kernel.
> >>>     Then the module .BTF.extra contains
> >>>       id 20: struct task_struct type encoding
> >>>       there is no need to encode more types beyond pointers.
> >>>       this can be simpler or more complex depending
> >>>       on what to do during module load.
> >>>
>
> Right, or in BTF you can use a FWD declaration for task_struct. The
> approach I'm using explicitly identifies types that are only
> pointer-referenced and uses FWDS for them, and this helps keep the
> representation as small as possible.
>
> >>> When a module load:
> >>>     For each .BTF.extra entry, trying to match
> >>>     the corresponding types in the current kernel.
> >>>     The type in the current type should have same
> >>>     size as the one in .BTF.extra if otherwise
> >>>     layout in the module btf may change.
> >>>
> >>>     If new kernel type can be used for module BTF,
> >>>     simply replace the old id with new id in module BTF.
> >>>
> >>>     Otherwise, type mismatch may happen and the corresponding
> >>>     module btf type should be invalidated.
>
> Yep, this is the process I describe as reconciliation; where we make
> sure base BTF at encoding time and current vmlinux BTF are compatible,
> and if so we renumber base BTF references in the module using the
> current vmlinux BTF ids. So if compatible, after reconciliation the
> module BTF looks just like any other module BTF built against that exact
> vmlinux.
>
> >> Yes, I agree, see my reply to Alan. I'm just unsure how strict we want
> >> to be and whether we need to record fields of expected vmlinux BTF
> >> types. Or if just recording expected size would be enough (to ensure
> >> correct memory layout if base BTF type is embedded into module BTF
> >> type).
> >>
> >> Perhaps, if BTF type is referenced from some "trusted" BTF type (used
> >> by kfunc, or in BTF ID set) we might want to enforce strict
> >> compatibility, but for any other type just make sure that size is
> >> correct (if it matters at all; i.e., if base BTF type is referenced by
> >> pointer only, we don't even need to check size).
> >
> > Agree. The above is a good start. I guess some real-world investigations
> > can help shape the actual design about what is the minimum change to
> > make it work.
> >
>
> I'll try and send a pointer to the work-in-progress code prior to the
> BPF office hours next week. In investigating how much info is required,
> for most in-tree modules (which I force-built with minimal BTF) we ended
> up with information about 4000 types or so. So it's a significant
> minimization compared to vmlinux BTF.

4000 is still quite a lot and is a significant fraction of vmlinux BTF
types. I'm curious if you measured the size increase from recording
these types?

>
> In this context, perhaps my describing the information we collect about
> base BTF as minimization is misleading; the intent is really focused not
> on making base BTF small (although of course that's important from a
> practical perspective), but collecting the info about base BTF needed to
> later reconcile it with the running kernel at load time. Maybe
> .BTF.base_expects or something like that might make this clearer? Thanks!
>
> Alan
> >>
> >> WDYT?
> >>
> >>>> Since minimization only keeps used struct and union members, couldn't
> >>>> you have two internal types from different modules which conflict and
> >>>> end up using the wrong offset?
> >>>>
> >>>> Example:
> >>>> in module M:
> >>>> struct S {
> >>>> ... // other unused members
> >>>> int x; // offset 12 (for example)
> >>>> }
> >>>>
> >>>> in module N:
> >>>> struct S {
> >>>> ... // other unused members
> >>>> int x; // offset 20 (something different from S.x in module M)
> >>>> }
> >>>>
> >