Re: [PATCH v4 bpf-next 00/17] Add kind layout, CRCs to BTF

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 22 Nov 2023 09:42:09 -0800

On Wed, Nov 22, 2023 at 9:00 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 21/11/2023 19:44, Andrii Nakryiko wrote:
> > On Tue, Nov 14, 2023 at 12:20 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> >>
> >> hi folks
> >>
> >> I wanted to capture feedback received on the approach described here for
> >> BTF module generation at my talk at LPC [1].
> >>
> >> Stepping back, the aim is to provide a way to generate BTF for a module
> >> such that it is somewhat resilient to minor changes in underlying BTF,
> >> so it does not have to be rebuilt every time vmlinux is built.  The
> >> module references to vmlinux BTF ids are currently very brittle, and
> >> even for the same kernel we get different vmlinux BTF ids if the BTF is
> >> rebuilt.  So the aim is to support a more robust method of module BTF
> >> generation.  Note that the approach described here is not needed for
> >> modules that are built at the same time as the kernel, so it's unlikely
> >> any in-tree modules will need this, but it will be useful for cases such
> >> as where modules are delivered via a package and want to make use
> >> of BTF such that it will not be invalidated.
> >>
> >> Turning to the talk, the general consensus - I think - was that the
> >> standalone BTF approach described in this series was problematic.
> >> Consider kfuncs: if we have, for example, our own definition of a
> >> structure in standalone module BTF, the BTF id of the local structure
> >> will not match that of the core kernel, which has the potential to
> >> confuse the verifier.
> >>
> >> A similar problem exists for tracing; we would trace an sk_buff in
> >> the module via the module's view of struct sk_buff, but we have no
> >> guarantees that the module's view is still consistent with the vmlinux
> >> representation (which actually allocated it).
> >>
> >> Hopefully I've characterized this correctly; let me know if I missed
> >> something here.
> >
> > Correct.
> >
> >>
> >> So we need some means to both remap BTF ids in the module BTF that refer
> >> to the vmlinux BTF so they point at the right types, _and_ to check the
> >> consistency of the representation of a vmlinux type between module BTF
> >> build time and when it is loaded into the kernel.
> >>
> >> With this in mind, I think a good way forward might be something like
> >> the following:
> >>
> >> For cases where we want more change-independent module BTF - which
> >> is resilient to things like reshuffling of vmlinux BTF ids, and small
> >> changes that don't invalidate structure use completely - we add
> >> a "relocatable" option to the --btf_features list of features for pahole
> >> encoding of module BTF.
> >>
> >> This option would not be needed for modules built at the same time as
> >> the kernel, since the BTF ids and the types they refer to are consistent.
> >>
> >> When used, however, it would tell BTF dedup in pahole to add relocation
> >> information as well as generating usual split BTF at the time of module
> >> BTF generation. This relocation information would consist of
> >> descriptions of the BTF types that the module refers to in base BTF and
> >> their dependents. By providing such descriptions, we can then reconcile
> >> the views of types between module and kernel, or if such reconciliation
> >> is impossible, we can refuse to use the BTF. The amount of information
> >> needed for a module will need to be determined, but I'm hopeful in most
> >> cases it would be a small subset of the type information
> >> required for vmlinux as a whole.
> >>
> >> The process of reconciling module and vmlinux BTF at module load time
> >> would then be
> >>
> >> 1. Remap all the split BTF ids representing module-specific types
> >>    and functions to start at last_vmlinux_id + 1. Since the current
> >>    vmlinux may have a different number of types than the vmlinux
> >>    at time of encoding, this remapping is necessary.
> >
> > Correct.
> >
> >>
> >> 2. For each vmlinux type in our list of relocations, check its
> >>    compatibility with the associated vmlinux type.  This is
> >>    somewhat akin to the CO-RE compatibility checks.  Exact rules
> >
> > Not really. CO-RE compatibility is explicitly very permissive, while
> > here we want to make sure that types are actually memory
> > layout-compatible.
> >
> >>    would need to be ironed out, but a somewhat loose approach
> >>    would be ideal such that a few minor changes in a struct
> >>    somewhere do not totally invalidate module BTF. Unlike CO-RE
> >>    though, field offset changes are _not_ good since they imply the
> >>    module has an incorrect view of the structure and might
> >>    start using fields incorrectly.
> >
> > I think vmlinux type should have at least all the members that module
> > expects, at the same offset, with the same size. Maybe we should allow
> > vmlinux type to get some fields at the end, not sure. How hard a
> > requirement is it to accommodate non-exact type matches between
> > vmlinux and kernel module's types?
> >
>
> The main need is to support resilience in the face of small structure
> changes such that the compiled module will still work. When backporting
> fixes to a stable-based kernel - where a version of say 5.15 stable is
> supported for a while and so accumulates stable fixes - often the
> approach used is to use holes in structures for new fields, or if the
> structure is not embedded in any module-specific structures, add fields
> at the end. All existing field offsets should match. In taking that
> approach, the aim is to make sure data accesses in the module are still
> valid - memory layout compatibility is the goal.

So we'll need to develop some checksum/hash that would accommodate
these allowed changes.
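
To make the "allowed changes" concrete, here is a rough sketch of the
rule as discussed so far: every member the module expects must be
present in the vmlinux type at the same offset and with the same size,
while extra members in holes or at the end are tolerated. This is a
member-by-member check rather than a hash, and it is only an
illustration. Member, StructLayout and layout_compatible are
hypothetical names, not pahole, libbpf or kernel APIs, and whether
growth at the end should be accepted is still an open question, hence
the allow_growth flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    offset: int  # byte offset within the struct
    size: int    # size in bytes


@dataclass(frozen=True)
class StructLayout:
    name: str
    size: int        # total struct size in bytes
    members: tuple   # tuple of Member


def layout_compatible(expected, actual, allow_growth=True):
    """Return True if 'actual' (vmlinux view) still satisfies 'expected'
    (the view recorded when the module BTF was generated).

    Hypothetical rule set, assumed for illustration only:
    - struct names must match;
    - every expected member must exist with identical offset and size;
    - extra members in 'actual' are ignored (hole fills, appended fields);
    - the struct may only grow, and only if allow_growth is set.
    """
    if expected.name != actual.name:
        return False
    if actual.size < expected.size:
        return False
    if actual.size != expected.size and not allow_growth:
        return False

    actual_by_name = {m.name: m for m in actual.members}
    for want in expected.members:
        have = actual_by_name.get(want.name)
        if have is None:
            return False
        if have.offset != want.offset or have.size != want.size:
            return False
    return True
```

A caller would pass allow_growth=False for any struct that the module
embeds in one of its own types, since appended fields would then shift
the module's own layout.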

>
> >>
> >>    Note that this is a bit easier than BTF deduplication, because
> >>    the deduplication process that happened at module encoding time
> >>    has already done the dependency checking for us; we just need
> >>    to do a type-by-type, 1-to-1 comparison between our relocation
> >>    types and current vmlinux types.
> >>
> >> 3. If all types are consistent, BTF is loaded and we remap the
> >>    module's vmlinux BTF id references to the corresponding
> >>    vmlinux BTF ids of the current vmlinux.
> >
> > Note that we might need to do something special for anonymous types
> > (modifiers, anon enums and structs/unions). Otherwise it's not clear
> > how to even map them between vmlinux BTF and module BTF.
> >
>
> Good point, we'd probably need to represent some sort of parent-child
> relationship to handle cases like this.

Probably best to keep such anonymous types in module's BTF. It might
add a bit of duplication, but will simplify the rest a lot.
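
If anonymous types do stay in the module's BTF, the id remapping from
steps 1 and 3 reduces to two cases, roughly as sketched below. This is
a simplified illustration and not existing code: remap_id and
base_id_map are made-up names, and base_id_map is assumed to be the
old-to-new mapping produced by the per-type compatibility check for
the named vmlinux types the module references.

```python
def remap_id(type_id, old_last_base_id, new_last_base_id, base_id_map):
    """Translate a type id recorded at module BTF encoding time into an
    id valid against the currently running vmlinux BTF.

    old_last_base_id: last vmlinux id when the module BTF was encoded
    new_last_base_id: last vmlinux id in the running kernel
    base_id_map:      dict of old vmlinux id -> new vmlinux id for the
                      base types the module refers to (assumed input)
    """
    if type_id == 0:
        # Id 0 is void in BTF and never moves.
        return 0
    if type_id > old_last_base_id:
        # Module-local type: keep its position relative to the end of
        # base BTF, so split ids start at new_last_base_id + 1.
        return type_id - old_last_base_id + new_last_base_id
    # Reference into base BTF: must have a relocation entry. A missing
    # entry raises KeyError, which a loader would treat as "reject BTF".
    return base_id_map[type_id]
```

For example, with 100 vmlinux types at encoding time and 120 now, the
first module-local id 101 becomes 121, while a reference to old
vmlinux id 7 becomes whatever base_id_map[7] says.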

>
> >>
> >> I _think_ this gets us what we want; more resilient module BTF,
> >> but with safety checks to ensure compatible representations.
> >> There were some suggestions of using a hashing method, but I think
> >> such a method presupposes we want exact type matches, which I suspect
> >> would be unlikely to be useful in practice as with most stable-based
> >> distros, small changes in types can be made due to fixes etc.
> >
> > What are "small changes" and how are they automatically determined and
> > validated?
> >
>
> See above, field additions in data structure holes or appended to
> structs for the most part. Once I have something rough working
> I'll see how it performs in practice and report back. Thanks!
>

SGTM.

> Alan
>
>
> >>
> >> There was also a suggestion of doing a full dedup, but I think the
> >> consensus in the room (which I agree with) is that would be hard
> >> to do in-kernel.  So the above approach is a compromise I think;
> >> it gets actual dedup at BTF creation time to create the list of
> >> references and dependents, and we later check them one-by-one on module
> >> load for compatibility.
> >>
> >> Anyway I just wanted to try and capture the feedback received, and
> >> lay out a possible direction. Any further thoughts or suggestions
> >> would be much appreciated. Thanks!
> >>
> >> Alan
> >>
> >> [1] https://lpc.events/event/17/contributions/1576/