Re: [PATCH v4 bpf-next 01/11] libbpf: add btf__distill_base() creating split BTF with distilled base BTF

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 22, 2024 at 9:42 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 21/05/2024 22:48, Andrii Nakryiko wrote:
> > On Fri, May 17, 2024 at 3:23 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> >>
> >> To support more robust split BTF, adding supplemental context for the
> >> base BTF type ids that split BTF refers to is required.  Without such
> >> references, a simple shuffling of base BTF type ids (without any other
> >> significant change) invalidates the split BTF.  Here the attempt is made
> >> to store additional context to make split BTF more robust.
> >>
> >> This context comes in the form of distilled base BTF providing minimal
> >> information (name and - in some cases - size) for base INTs, FLOATs,
> >> STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that
> >> points at that base and contains any additional types needed (such as
> >> TYPEDEF, PTR and anonymous STRUCT/UNION declarations).  This
> >> information constitutes the minimal BTF representation needed to
> >> disambiguate or remove split BTF references to base BTF.  The rules
> >> are as follows:
> >>
> >> - INT, FLOAT are recorded in full.
> >> - if a named base BTF STRUCT or UNION is referred to from split BTF, it
> >>   will be encoded either as a zero-member sized STRUCT/UNION (preserving
> >>   size for later relocation checks) or as a named FWD.  Only base BTF
> >>   STRUCT/UNIONs that are either embedded in split BTF STRUCT/UNIONs or
> >>   that have multiple STRUCT/UNION instances of the same name need to
> >>   preserve size information, so a FWD representation will be used in
> >>   most cases.
> >> - if an ENUM[64] is named, a ENUM forward representation (an ENUM
> >>   with no values) is used.
> >> - in all other cases, the type is added to the new split BTF.
> >>
> >> Avoiding struct/union/enum/enum64 expansion is important to keep the
> >> distilled base BTF representation to a minimum size.
> >>
> >> When successful, new representations of the distilled base BTF and new
> >> split BTF that refers to it are returned.  Both need to be freed by the
> >> caller.
> >>
> >> So to take a simple example, with split BTF with a type referring
> >> to "struct sk_buff", we will generate distilled base BTF with a
> >> FWD struct sk_buff, and the split BTF will refer to it instead.
> >>
> >> Tools like pahole can utilize such split BTF to populate the .BTF
> >> section (split BTF) and an additional .BTF.base section.  Then
> >> when the split BTF is loaded, the distilled base BTF can be used
> >> to relocate split BTF to reference the current (and possibly changed)
> >> base BTF.
> >>
> >> So for example if "struct sk_buff" was id 502 when the split BTF was
> >> originally generated,  we can use the distilled base BTF to see that
> >> id 502 refers to a "struct sk_buff" and replace instances of id 502
> >> with the current (relocated) base BTF sk_buff type id.
> >>
> >> Distilled base BTF is small; when building a kernel with all modules
> >> using distilled base BTF as a test, ovreall module size grew by only
> >
> > typo: overall
> >
> >> 5.3Mb total across ~2700 modules.
> >>
> >> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
> >> ---
> >>  tools/lib/bpf/btf.c      | 409 ++++++++++++++++++++++++++++++++++++++-
> >>  tools/lib/bpf/btf.h      |  20 ++
> >>  tools/lib/bpf/libbpf.map |   1 +
> >>  3 files changed, 424 insertions(+), 6 deletions(-)
> >>

[...]

> >> +#define BTF_ID(id)             (id & ~BTF_NEEDS_SIZE)
> >> +
> >> +struct btf_distill {
> >> +       struct btf_pipe pipe;
> >> +       int *ids;
> >
> > reading the rest of the code, this BTF_NEEDS_SIZE and BTF_ID() macro
> > use was quite distracting. I'm wondering if it would be better to use
> > a simple struct with bitfields here, e.g.,
> >
> > struct type_state {
> >     int id: 31;
> >     bool needs_size;
> > };
> >
> > struct dist_state *states;
> >
> > Same memory usage, same efficiency, but more readable code. WDYT?
> >
>
> Yeah, I saw Eduard used that approach elsewhere, it's much neater.
> However in other discussion with Eduard we talked about moving the
> embedded/duplicate validation to relocation time. The motivation for
> that is that if we record sizes for all STRUCTs and UNIONs, we can then
> be selective about size checks at relocation time.  This is particularly
> relevant for the duplicate name case since it's possible a name either
> is no longer duplicate in the new vmlinux, or in the new vmlinux has a
> duplicate where the original vmlinux did not. In other words it's the
> "new world" of the vmlinux we're trying to relocate with that we're
> really concerned with, so preserving all size information until we see
> that new vmlinux and can spot duplicates and embedded types where the
> size checks need to be enforced makes sense I think.

Sure, I think it makes sense (though let's see how much more
complexity we add to relocation code).

>
> In that context, we mark embedded types prior to assigning mappings, and
> don't mark duplicate names at all in the map since it is duplicates in
> the new vmlinux we're interested in. So the way I'd approached it is to
> have a BTF_IS_EMBEDDED (casting -1) value that the map values would set
> when they found a split BTF struct/union with an embedded base BTF type.
> That later gets overwritten when we do the map assignments so it never
> gets exposed to the user and we can still return a __u32 array of BTF
> ids to the caller.

My only question/concern is duplicate named entries in distilled base.
How will that work with binary search (at least the exact variant you
are using). But I'm not sure I understand all the nuances of what you
agreed on.

[...]

> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +/* Check if composite type has a duplicate-named type; if it does, retain
> >
> > see above about check vs mark, here at least the function name uses "mark" :)
> >
> >> + * size information to help guide later relocation towards the correct type.
> >> + * For example there are duplicate 'dma_chan' structs in vmlinux BTF;
> >> + * one is size 112, the other 16.  Having size information allows relocation
> >> + * to choose the right one.
> >
> > re: this dma_chan and similar cases. Is it ever a problem where a
> > module actually needs two dma_chan in distilled base BTF? Name
> > conflicts do happen, but intuitively I'd expect this to happen between
> > some vmlinux-internal (so to speak, i.e., not the type used in
> > exported functions/types) and kernel module-specific types. It would
> > be awkward for the same module to use two different types that are
> > named the same.
> >
> > Have you seen this as a problem in practice? What if for now we just
> > error out if there are two conflicting types required in distilled
> > BTF?
>
> Funnily enough I ran into it with "dma_chan" itself when trying to load
> an in-tree module which I'd forced (along with the rest of the tree) to
> be built with distilled base BTF.  And it was also an embedded struct
> too, so we got 2 for 1 there ;-)
>
> But it does seem like it's a legitimate problem, so if we can address it
> I think it'd be worth trying. Even the size checks (applied at
> relocation time) would be better than nothing I think.
>
> The only other way I could think that we could disambiguate things would
> be to add the duplicate-named type to the split BTF (as we do with the
> anonymous structs).  That would remove the ambiguity, but expose us to
> the risk of having to pull in a lot more types. I can't imagine a
> duplicate-named type would ever figure in a kfunc signature, so it
> should be safe to do. But I think either way we probably have to have
> some sort of answer for this problem.
>

Ok, just keep in mind that duplicated names in distilled base BTF do
cause issues when "joining" actual vmlinux BTF and distilled base BTF.

[...]

> >> +
> >> +       /* struct/union members not needed, except for anonymous structs
> >> +        * and unions, which we need since name won't help us determine
> >> +        * matches; so if a named struct/union, no need to recurse
> >> +        * into members.
> >> +        */
> >
> > is this an outdated comment? we shouldn't have anonymous types in the
> > distilled base, right?
>
> Yep, they go into split BTF. However, we might need to add their members
> to our distilled base; for example if we hadn't yet added an int and had
>
>         struct {
>                 int field;
>         };
>
> ...we'd want to make sure we added an INT to our distilled base. I'll
> try and clarify the comment a bit more.

I see, it makes sense. Yes, let's add a comment, it's a bit subtle.

>
> >
> >> +       if (btf_is_eligible_named_fwd(t))
> >> +               return 0;
> >> +

[...]

> >> +                               err = btf__add_enum(dist->pipe.dst, name, t->size);
> >> +                               break;
> >> +                       default:
> >> +                               pr_warn("unexpected kind [%u] when creating distilled base BTF.\n",
> >> +                                       btf_kind(t));
> >> +                               return -EINVAL;
> >> +                       }
> >> +               } else {
> >> +                       err = btf_add_type(&dist->pipe, t);
> >
> > So this should never happen if adding_to_base == true, is that right?
> > Can we check this? Or am I missing something as usual?
> >
>
> We're currently adding INTs and FLOATs to the base also, so they are two
> cases where we end up here I think.

let's have them as another explicit case and then warn for everything
we don't expect to be added? I'm trying to prevent silent problems we
haven't thought about or that will appear in the future due to BTF
extension, so let's be conservative everywhere.

>
> >> +               }
> >> +               if (err < 0)
> >> +                       break;
> >> +               dist->ids[i] = id++;
> >> +       }
> >> +       return err;
> >> +}
> >> +
> >

[...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux