Re: [RFC bpf-next 1/8] btf: add kind metadata encoding to UAPI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 5, 2023 at 3:38 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Mon, Jun 5, 2023 at 9:15 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Fri, Jun 2, 2023 at 1:34 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > >
> > > > > > > >> +
> > > > > > > >> +struct btf_kind_meta {
> > > > > > > >> +    __u32 name_off;         /* kind name string offset */
> > > > >
> > > > > I'm not sure why we'd need to record this for every KIND? The tool
> > > > > that doesn't know about this new kind can't do much about it anyways,
> > > > > so whether it knows that this is "KIND_NEW_FANCY" or just its ID #123
> > > > > doesn't make much difference?
> > > >
> > > > The name is certainly more meaningful than 123.
> > > > bpftool output is consumed by humans who will be able to tell the difference.
> > > > I'd keep the name here.
> > >
> > > Ok, it's fine. When I was originally proposing this compact metadata,
> > > I was trying to make it as minimal as possible, so adding 80 bytes
> > > just for string offset fields (plus a bunch of strings) felt like an
> > > unnecessary overhead. But it's not a big deal.
> >
> > Exactly. It's just 4 * num_kinds bytes in and ~20 * num_kinds for
> > names, but it implements 'self description'.
> > Otherwise the names become an external knowledge and BTF is not self described.
> >
> >
> > > >
> > > > > > > > and would bump the BTF_VERSION to 2 to make it a 'milestone'.
> > > > >
> > > > > Bumping BTF_VERSION to 2 automatically makes BTF incompatible with all
> > > > > existing kernels (and potentially many tools that parse BTF). Given we
> > > > > can actually extend BTF in backwards compatible way by just adding an
> > > > > optional two fields to btf_header + extra bytes for metadata sections,
> > > > > why making our lives harder by bumping this version?
> > > >
> > > > I fail to see how bumping the version makes it harder.
> > > > libbpf needs to sanitize meta* fields in the struct btf_header on
> > > > older kernels anway. At the same time sanitizing the version from 2 to
> > > > 1
> > > > in the same header is one extra line of code in libbpf.
> > > > What am I missing?
> > >
> > > So I checked libbpf code, and libbpf doesn't really check the version
> > > field. So for the most part this BTF_VERSION bump wouldn't matter for
> > > any tool that's based on libbpf's struct btf API. But if libbpf did
> > > check version (as it probably should have), then by upgrading to newer
> > > Clang that would emit BTF with this metadata (but no new fancy
> > > BTF_KIND_KERNEL_FUNC or anything like that), we automatically make
> > > such BTF incompatible with all those tools.
> > >
> > > Kernel is a bit different because it's extremely strict about BTF. I'm
> > > more worried about tools like bpftool (but we don't check BTF_VERSION
> > > there due to libbpf), llvm-objdump (when it supports BTF), etc.
> > >
> > > On the other hand, what do we gain by bumping this BTF_VERSION?
> >
> > The version bump will be an indication that
> > v2 of BTF has enough info in the format for any tool/kernel to consume it.
> > With v2 we should make BTF_KIND_META description mandatory.
> > If we keep it as v1 then the presence of BTF_KIND_META would be
> > an indication of 'self described' format.
> > Which is also ok-ish, but seems less clean.
> > zero vs not-zero of meta_off in btf_header is pretty much v1 vs v2.
> >
>
> We had a long offline discussion w/ Alexei about this whole
> self-describing BTF, and I will try to summarize it here a bit,
> because I think we both think about "self-describing" differently, and
> as a result few different aspects are conflated with each other (there
> are at least 3(!) different things here).

Thanks for summarizing. All correct.

> From my perspective, this self-describing BTF metadata was purely
> designed to allow tools without latest BTF knowledge to be able to
> skip over unknown BTF_KIND_xxx, at most being able to tell whether
> it's critical for understanding BTF (that's the OPTIONAL flag) or not.
> I.e., with older bpftool (but the one that knows about btf_metadata,
> of course), it would still be possible to `bpftool btf dump file
> <file-with-newer-btf-kinds>` just fine, except for new KINDS (which
> would be just emitted as "unknown BTF_KIND_XXX, skipping...".
>
> I think this problem is solved with this fixed + per-vlen sz and those
> few extra flags.

I'm fine with this approach as long as we don't fool ourselves that
we're doing a "self described" format.
We have a "size" field in btf_header. With this btf_metadata extension
we're effectively adding "size" fields for each btf kind and its vlen part.
So what Alan proposed:
+struct btf_kind_meta {
+       __u16 flags;            /* see BTF_KIND_META_* values above */
+       __u8 info_sz;           /* size of singular element after btf_type */
+       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
+};

_without_ name_off it makes the most sense.

As soon as we're trying to add 'name_off' to the kind we're falling into
the trap of thinking that we're adding "self described" format and
btf_kind_meta needs to actually describe it for printing (with
real name and not just integer id) and further trying to describe
semantics of unknown kind with another flag that Andrii's proposed:
"Another flag I was thinking about was a flag whether struct btf_type's
type/size field is a type or a size (or something else)."

imo name_off and that other flag in addition to optional_or_not flag
are carrying the concept too far.

We should just say upfront that this "struct btf_kind_meta" is to be
able to extend BTF easier and nothing else.
"old" bpftool will be able to skip unknown kinds, but dedup
probably won't be able to skip much anyway.

I'd also call it "struct btf_kind_description|layout|sizes"
to narrow the scope.
This BTF extension is not going to describe semantics of unknown kinds.
Instead of "best effort" attempts with flags like "what type/size means"
let's not even go there.

If we go this simple route I'm fine with hard coded crc and base_crc
fields. They probably should go to btf_header though.
We don't need "struct btf_metadata" as well.
It's making things sound beyond what it actually is.
btf_header can point to an array of struct btf_kind_description.
As simple as it can get.
No need for json like format and key/value things either.
We're not creating a self described BTF format.
We're just adding a few size fields.
The kernel/libbpf/dedup still needs to known semantics of future kinds
to be able to print/operate on them.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux