On Thu, Jun 1, 2023 at 3:38 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > On 01/06/2023 04:53, Alexei Starovoitov wrote: > > On Wed, May 31, 2023 at 09:19:28PM +0100, Alan Maguire wrote: > >> BTF kind metadata provides information to parse BTF kinds. > >> By separating parsing BTF from using all the information > >> it provides, we allow BTF to encode new features even if > >> they cannot be used. This is helpful in particular for > >> cases where newer tools for BTF generation run on an > >> older kernel; BTF kinds may be present that the kernel > >> cannot yet use, but at least it can parse the BTF > >> provided. Meanwhile userspace tools with newer libbpf > >> may be able to use the newer information. > >> > >> The intent is to support encoding of kind metadata > >> optionally so that tools like pahole can add this > >> information. So for each kind we record > >> > >> - a kind name string > >> - kind-related flags > >> - length of singular element following struct btf_type > >> - length of each of the btf_vlen() elements following > >> > >> In addition we make space in the metadata for > >> CRC32s computed over the BTF along with a CRC for > >> the base BTF; this allows split BTF to identify > >> a mismatch explicitly. Finally we provide an > >> offset for an optional description string. > >> > >> The ideas here were discussed at [1] hence > >> > >> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > >> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > >> > >> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@xxxxxxxxxxxxxx/ > >> --- > >> include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++ > >> tools/include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++ > >> 2 files changed, 58 insertions(+) > >> > >> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h > >> index ec1798b6d3ff..94c1f4518249 100644 > >> --- a/include/uapi/linux/btf.h > >> +++ b/include/uapi/linux/btf.h > >> @@ -8,6 +8,34 @@ > >> #define BTF_MAGIC 0xeB9F > >> #define BTF_VERSION 1 > >> > >> +/* is this information required? If so it cannot be sanitized safely. */ > >> +#define BTF_KIND_META_OPTIONAL (1 << 0) > >> + > >> +struct btf_kind_meta { > >> + __u32 name_off; /* kind name string offset */ > >> + __u16 flags; /* see BTF_KIND_META_* values above */ > >> + __u8 info_sz; /* size of singular element after btf_type */ > >> + __u8 elem_sz; /* size of each of btf_vlen(t) elements */ > >> +}; > >> + > >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */ > >> +#define BTF_META_CRC_SET (1 << 0) > >> +#define BTF_META_BASE_CRC_SET (1 << 1) > >> + > >> +struct btf_metadata { > >> + __u8 kind_meta_cnt; /* number of struct btf_kind_meta */ > >> + __u32 flags; > >> + __u32 description_off; /* optional description string */ > >> + __u32 crc; /* crc32 of BTF */ > >> + __u32 base_crc; /* crc32 of base BTF */ > >> + struct btf_kind_meta kind_meta[]; > >> +}; > >> + > >> +struct btf_meta_header { > >> + __u32 meta_off; /* offset of metadata section */ > >> + __u32 meta_len; /* length of metadata section */ > >> +}; > >> + > >> struct btf_header { > >> __u16 magic; > >> __u8 version; > >> @@ -19,6 +47,7 @@ struct btf_header { > >> __u32 type_len; /* length of type section */ > >> __u32 str_off; /* offset of string section */ > >> __u32 str_len; /* length of string section */ > >> + struct btf_meta_header meta_header; > >> }; > >> > >> /* Max # of type identifier */ > >> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h > >> index ec1798b6d3ff..94c1f4518249 100644 > >> --- a/tools/include/uapi/linux/btf.h > >> +++ b/tools/include/uapi/linux/btf.h > >> @@ -8,6 +8,34 @@ > >> #define BTF_MAGIC 0xeB9F > >> #define BTF_VERSION 1 > >> > >> +/* is this information required? If so it cannot be sanitized safely. */ > >> +#define BTF_KIND_META_OPTIONAL (1 << 0) > >> + > >> +struct btf_kind_meta { > >> + __u32 name_off; /* kind name string offset */ > >> + __u16 flags; /* see BTF_KIND_META_* values above */ > >> + __u8 info_sz; /* size of singular element after btf_type */ > >> + __u8 elem_sz; /* size of each of btf_vlen(t) elements */ > >> +}; > >> + > >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */ > >> +#define BTF_META_CRC_SET (1 << 0) > >> +#define BTF_META_BASE_CRC_SET (1 << 1) > >> + > >> +struct btf_metadata { > >> + __u8 kind_meta_cnt; /* number of struct btf_kind_meta */ > > > > Overall, looks great. > > Few small nits: > > I'd make kind_meta_cnt u32, since padding we won't be able to reuse anyway > > and would bump the BTF_VERSION to 2 to make it a 'milestone'. > > v2 -> self described. > > sure, sounds good. One other change perhaps worth making; currently > we assume that the kind metadata is at the end of the struct > btf_metadata, but if we ever wanted to add metadata fields in the > future, we'd want so support both the current metadata structure and > any future structure which had additional fields. > > With that in mind, it might make sense to go with something like > > struct btf_metadata { > __u32 kind_meta_cnt; > __u32 kind_meta_offset; /* kind_meta_cnt instances of struct > btf_kind_meta start here */ > __u32 flags; > __u32 description_off; /* optional description string*/ > __u32 crc; /* crc32 of BTF */ > __u32 base_crc; /* crc32 of base BTF */ > }; > > For the original version, kind_meta_offset would just be > at meta_off + sizeof(struct btf_metadata), but if we had multiple > versions of the btf_metadata header to handle, they could all rely on > the kind_meta_offset being where kind metadata is stored. > For validation we'd have to make sure kind_meta_offset was within > the the metadata header range. kind_meta_offset is an ok idea, but I don't quite see why we'd have multiple 'struct btf_metadata' pointing to the same set of 'struct btf_kind_meta'. Also why do we need description_off ? Shouldn't string go into btf_header->str_off ? > > > >> + __u32 flags; > >> + __u32 description_off; /* optional description string */ > >> + __u32 crc; /* crc32 of BTF */ > >> + __u32 base_crc; /* crc32 of base BTF */ > > > > Hard coded CRC also gives me a pause. > > Should it be an optional KIND like btf tags? > > The goal of the CRC is really just to provide a unique identifier that > we can use for things like checking if there's a mismatch between > base and module BTF. If we want to ever do CRC validation (not sure > if there's a case for that) we probably need to think about cases like > BTF sanitization of BPF program BTF; this would likely only be an > issue if metadata support is added to BPF compilers. > > The problem with adding it via a kind is that if we first compute > the CRC over the entire BTF object and then add the kind, the addition > of the kind breaks the CRC; as a result I _think_ we're stuck with > having to have it in the header. Hmm. libbpf can add BTF_KIND_CRC with zero-ed u32 crc field and later fill it in. It's really not different than u32 crc field inside 'struct btf_metadata'. > That said I don't think CRC is necessarily the only identifier > we could use, and we don't even need to identify it as a > CRC in the UAPI, just as a "unique identifier"; that would deal > with issues about breaking the CRC during sanitization. All > depends on whether we ever see a need to verify BTF via CRC > really. Right. It could be sha or anything else, but user space and kernel need to agree on the math to compute it, so something got to indicate that this 32-bit is a crc. Hence KIND_CRC, KIND_SHA fit better.