On Mon, Jun 5, 2023 at 7:46 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Mon, Jun 5, 2023 at 3:38 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Mon, Jun 5, 2023 at 9:15 AM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Fri, Jun 2, 2023 at 1:34 PM Andrii Nakryiko > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > >> + > > > > > > > > >> +struct btf_kind_meta { > > > > > > > > >> + __u32 name_off; /* kind name string offset */ > > > > > > > > > > > > I'm not sure why we'd need to record this for every KIND? The tool > > > > > > that doesn't know about this new kind can't do much about it anyways, > > > > > > so whether it knows that this is "KIND_NEW_FANCY" or just its ID #123 > > > > > > doesn't make much difference? > > > > > > > > > > The name is certainly more meaningful than 123. > > > > > bpftool output is consumed by humans who will be able to tell the difference. > > > > > I'd keep the name here. > > > > > > > > Ok, it's fine. When I was originally proposing this compact metadata, > > > > I was trying to make it as minimal as possible, so adding 80 bytes > > > > just for string offset fields (plus a bunch of strings) felt like an > > > > unnecessary overhead. But it's not a big deal. > > > > > > Exactly. It's just 4 * num_kinds bytes in and ~20 * num_kinds for > > > names, but it implements 'self description'. > > > Otherwise the names become an external knowledge and BTF is not self described. > > > > > > > > > > > > > > > > > > > > and would bump the BTF_VERSION to 2 to make it a 'milestone'. > > > > > > > > > > > > Bumping BTF_VERSION to 2 automatically makes BTF incompatible with all > > > > > > existing kernels (and potentially many tools that parse BTF). Given we > > > > > > can actually extend BTF in backwards compatible way by just adding an > > > > > > optional two fields to btf_header + extra bytes for metadata sections, > > > > > > why making our lives harder by bumping this version? > > > > > > > > > > I fail to see how bumping the version makes it harder. > > > > > libbpf needs to sanitize meta* fields in the struct btf_header on > > > > > older kernels anway. At the same time sanitizing the version from 2 to > > > > > 1 > > > > > in the same header is one extra line of code in libbpf. > > > > > What am I missing? > > > > > > > > So I checked libbpf code, and libbpf doesn't really check the version > > > > field. So for the most part this BTF_VERSION bump wouldn't matter for > > > > any tool that's based on libbpf's struct btf API. But if libbpf did > > > > check version (as it probably should have), then by upgrading to newer > > > > Clang that would emit BTF with this metadata (but no new fancy > > > > BTF_KIND_KERNEL_FUNC or anything like that), we automatically make > > > > such BTF incompatible with all those tools. > > > > > > > > Kernel is a bit different because it's extremely strict about BTF. I'm > > > > more worried about tools like bpftool (but we don't check BTF_VERSION > > > > there due to libbpf), llvm-objdump (when it supports BTF), etc. > > > > > > > > On the other hand, what do we gain by bumping this BTF_VERSION? > > > > > > The version bump will be an indication that > > > v2 of BTF has enough info in the format for any tool/kernel to consume it. > > > With v2 we should make BTF_KIND_META description mandatory. > > > If we keep it as v1 then the presence of BTF_KIND_META would be > > > an indication of 'self described' format. > > > Which is also ok-ish, but seems less clean. > > > zero vs not-zero of meta_off in btf_header is pretty much v1 vs v2. > > > > > > > We had a long offline discussion w/ Alexei about this whole > > self-describing BTF, and I will try to summarize it here a bit, > > because I think we both think about "self-describing" differently, and > > as a result few different aspects are conflated with each other (there > > are at least 3(!) different things here). > > Thanks for summarizing. All correct. > > > From my perspective, this self-describing BTF metadata was purely > > designed to allow tools without latest BTF knowledge to be able to > > skip over unknown BTF_KIND_xxx, at most being able to tell whether > > it's critical for understanding BTF (that's the OPTIONAL flag) or not. > > I.e., with older bpftool (but the one that knows about btf_metadata, > > of course), it would still be possible to `bpftool btf dump file > > <file-with-newer-btf-kinds>` just fine, except for new KINDS (which > > would be just emitted as "unknown BTF_KIND_XXX, skipping...". > > > > I think this problem is solved with this fixed + per-vlen sz and those > > few extra flags. > > I'm fine with this approach as long as we don't fool ourselves that > we're doing a "self described" format. > We have a "size" field in btf_header. With this btf_metadata extension > we're effectively adding "size" fields for each btf kind and its vlen part. > So what Alan proposed: > +struct btf_kind_meta { > + __u16 flags; /* see BTF_KIND_META_* values above */ > + __u8 info_sz; /* size of singular element after btf_type */ > + __u8 elem_sz; /* size of each of btf_vlen(t) elements */ > +}; > > _without_ name_off it makes the most sense. Yep. I didn't see much need for name_off as well. > > As soon as we're trying to add 'name_off' to the kind we're falling into > the trap of thinking that we're adding "self described" format and > btf_kind_meta needs to actually describe it for printing (with > real name and not just integer id) and further trying to describe > semantics of unknown kind with another flag that Andrii's proposed: > "Another flag I was thinking about was a flag whether struct btf_type's > type/size field is a type or a size (or something else)." > > imo name_off and that other flag in addition to optional_or_not flag > are carrying the concept too far. > > We should just say upfront that this "struct btf_kind_meta" is to be > able to extend BTF easier and nothing else. Yep, that was precisely (at least my) intent. It's great that we are converging on this. > "old" bpftool will be able to skip unknown kinds, but dedup > probably won't be able to skip much anyway. Agreed, I don't think we can ever make BTF dedup work reliably with KINDs it doesn't understand. I wouldn't even try. I'd also say that kernel should keep being strict about this (even if we add "is-it-optional" field, kernel can't trust it). Libbpf and other libraries will have to keep sanitizing BTF anyways. I'm also ok with dropping "optional_or_not" flag. Same for "does-it-point-to-type" flag. I can see some use for the latter (e.g., still being able to calculate sizes and stuff), but it's nothing super critical, IMO. > > I'd also call it "struct btf_kind_description|layout|sizes" I like btf_kind_layout, it's short and to the point. > to narrow the scope. > This BTF extension is not going to describe semantics of unknown kinds. > Instead of "best effort" attempts with flags like "what type/size means" > let's not even go there. > Ack. > If we go this simple route I'm fine with hard coded crc and base_crc > fields. They probably should go to btf_header though. Yep, on btf_header fields. But I'd not hardcode "crc" name. If we are doing them as strings (which I think we should instead of dooming them to 32-bit integer crc32 value only), then can we just say generically that it's either "id" or "checksum"? But I guess crc32 would be fine in practice as well. So not something I strongly feel about. > We don't need "struct btf_metadata" as well. > It's making things sound beyond what it actually is. > btf_header can point to an array of struct btf_kind_description. > As simple as it can get. Agreed. Still, it's a third section, and we should at least have a count of those btf_kind_layout items somewhere. > No need for json like format and key/value things either. > We're not creating a self described BTF format. > We're just adding a few size fields. Ack, great. > The kernel/libbpf/dedup still needs to known semantics of future kinds > to be able to print/operate on them. Yes.