On Fri, Jun 2, 2023 at 9:32 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Thu, Jun 1, 2023 at 9:54 AM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Thu, Jun 1, 2023 at 3:38 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > > > > > On 01/06/2023 04:53, Alexei Starovoitov wrote: > > > > On Wed, May 31, 2023 at 09:19:28PM +0100, Alan Maguire wrote: > > > >> BTF kind metadata provides information to parse BTF kinds. > > > >> By separating parsing BTF from using all the information > > > >> it provides, we allow BTF to encode new features even if > > > >> they cannot be used. This is helpful in particular for > > > >> cases where newer tools for BTF generation run on an > > > >> older kernel; BTF kinds may be present that the kernel > > > >> cannot yet use, but at least it can parse the BTF > > > >> provided. Meanwhile userspace tools with newer libbpf > > > >> may be able to use the newer information. > > > >> > > > >> The intent is to support encoding of kind metadata > > > >> optionally so that tools like pahole can add this > > > >> information. So for each kind we record > > > >> > > > >> - a kind name string > > > >> - kind-related flags > > > >> - length of singular element following struct btf_type > > > >> - length of each of the btf_vlen() elements following > > > >> > > > >> In addition we make space in the metadata for > > > >> CRC32s computed over the BTF along with a CRC for > > > >> the base BTF; this allows split BTF to identify > > > >> a mismatch explicitly. Finally we provide an > > > >> offset for an optional description string. > > > >> > > > >> The ideas here were discussed at [1] hence > > > >> > > > >> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > >> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > > > >> > > > >> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@xxxxxxxxxxxxxx/ > > > >> --- > > > >> include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++ > > > >> tools/include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++ > > > >> 2 files changed, 58 insertions(+) > > > >> > > > >> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h > > > >> index ec1798b6d3ff..94c1f4518249 100644 > > > >> --- a/include/uapi/linux/btf.h > > > >> +++ b/include/uapi/linux/btf.h > > > >> @@ -8,6 +8,34 @@ > > > >> #define BTF_MAGIC 0xeB9F > > > >> #define BTF_VERSION 1 > > > >> > > > >> +/* is this information required? If so it cannot be sanitized safely. */ > > > >> +#define BTF_KIND_META_OPTIONAL (1 << 0) > > Another flag I was thinking about was a flag whether struct btf_type's > type/size field is a type or a size (or something else). E.g., let's > say we haven't had btf_type_tag yet and were adding it after we had > this new metadata. We could say that type_tag's type/size field is > actually a type ID, and generic tools like bpftool could basically > skip type_tag and resolve to underlying type. This way, optional > modifier/decorator KINDs won't even have to break applications using > old libbpf's when it comes to calculating type sizes and resolving > them. +1 > > > >> + > > > >> +struct btf_kind_meta { > > > >> + __u32 name_off; /* kind name string offset */ > > I'm not sure why we'd need to record this for every KIND? The tool > that doesn't know about this new kind can't do much about it anyways, > so whether it knows that this is "KIND_NEW_FANCY" or just its ID #123 > doesn't make much difference? The name is certainly more meaningful than 123. bpftool output is consumed by humans who will be able to tell the difference. I'd keep the name here. > > > > and would bump the BTF_VERSION to 2 to make it a 'milestone'. > > Bumping BTF_VERSION to 2 automatically makes BTF incompatible with all > existing kernels (and potentially many tools that parse BTF). Given we > can actually extend BTF in backwards compatible way by just adding an > optional two fields to btf_header + extra bytes for metadata sections, > why making our lives harder by bumping this version? I fail to see how bumping the version makes it harder. libbpf needs to sanitize meta* fields in the struct btf_header on older kernels anway. At the same time sanitizing the version from 2 to 1 in the same header is one extra line of code in libbpf. What am I missing? > > > > > v2 -> self described. > > > > > > sure, sounds good. One other change perhaps worth making; currently > > > we assume that the kind metadata is at the end of the struct > > > btf_metadata, but if we ever wanted to add metadata fields in the > > > future, we'd want so support both the current metadata structure and > > > any future structure which had additional fields. > > see above, another reason to make metadata a separate section, in > addition to types and strings > > > > > > > With that in mind, it might make sense to go with something like > > > > > > struct btf_metadata { > > > __u32 kind_meta_cnt; > > > __u32 kind_meta_offset; /* kind_meta_cnt instances of struct > > > btf_kind_meta start here */ > > > __u32 flags; > > > __u32 description_off; /* optional description string*/ > > > __u32 crc; /* crc32 of BTF */ > > > __u32 base_crc; /* crc32 of base BTF */ > > > }; > > > > > > For the original version, kind_meta_offset would just be > > > at meta_off + sizeof(struct btf_metadata), but if we had multiple > > > versions of the btf_metadata header to handle, they could all rely on > > > the kind_meta_offset being where kind metadata is stored. > > > For validation we'd have to make sure kind_meta_offset was within > > > the the metadata header range. > > > > kind_meta_offset is an ok idea, but I don't quite see why we'd have > > multiple 'struct btf_metadata' pointing to the same set of 'struct > > btf_kind_meta'. > > > > Also why do we need description_off ? Shouldn't string go into > > btf_header->str_off ? > > > > > > > > > >> + __u32 flags; > > > >> + __u32 description_off; /* optional description string */ > > > >> + __u32 crc; /* crc32 of BTF */ > > > >> + __u32 base_crc; /* crc32 of base BTF */ > > > > > > > > Hard coded CRC also gives me a pause. > > > > Should it be an optional KIND like btf tags? > > > > > > The goal of the CRC is really just to provide a unique identifier that > > > we can use for things like checking if there's a mismatch between > > > base and module BTF. If we want to ever do CRC validation (not sure > > > if there's a case for that) we probably need to think about cases like > > > BTF sanitization of BPF program BTF; this would likely only be an > > > issue if metadata support is added to BPF compilers. > > > > > > The problem with adding it via a kind is that if we first compute > > > the CRC over the entire BTF object and then add the kind, the addition > > > of the kind breaks the CRC; as a result I _think_ we're stuck with > > > having to have it in the header. > > > > Hmm. libbpf can add BTF_KIND_CRC with zero-ed u32 crc field > > and later fill it in. > > It's really not different than u32 crc field inside 'struct btf_metadata'. > > > > > That said I don't think CRC is necessarily the only identifier > > > we could use, and we don't even need to identify it as a > > > CRC in the UAPI, just as a "unique identifier"; that would deal > > > with issues about breaking the CRC during sanitization. All > > > depends on whether we ever see a need to verify BTF via CRC > > > really. > > > > Right. It could be sha or anything else, but user space and kernel > > need to agree on the math to compute it, so something got to indicate > > that this 32-bit is a crc. > > Hence KIND_CRC, KIND_SHA fit better. > > what if instead of crc and base_src fields, we have > > __u32 id_str_off; > __u32 base_id_str_off; > > and they are offsets into a string section. We can then define that > those strings have to be something like "crc:<crc-value>" or > "sha:<sha-value". This will be a generic ID, and extensible (and more > easily extensible, probably), but won't require new KIND. Encoding binary data in strings with \0 and other escape chars? Ouch. Please no. We can have variable size KIND_ID and encode crc vs sha in flags, but binary data better stay binary. > This also has a good property that 0 means "no ID", which helps with > the base BTF case. Current "__u32 crc;" doesn't have this property and > requires a flag. imo this crc addition is a litmus test for this self-described format. If we cannot encode it as a new KIND* it means this self-described idea is broken.