Re: [RFC bpf-next 1/8] btf: add kind metadata encoding to UAPI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 2, 2023 at 11:12 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, Jun 2, 2023 at 9:32 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Thu, Jun 1, 2023 at 9:54 AM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jun 1, 2023 at 3:38 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> > > >
> > > > On 01/06/2023 04:53, Alexei Starovoitov wrote:
> > > > > On Wed, May 31, 2023 at 09:19:28PM +0100, Alan Maguire wrote:
> > > > >> BTF kind metadata provides information to parse BTF kinds.
> > > > >> By separating parsing BTF from using all the information
> > > > >> it provides, we allow BTF to encode new features even if
> > > > >> they cannot be used.  This is helpful in particular for
> > > > >> cases where newer tools for BTF generation run on an
> > > > >> older kernel; BTF kinds may be present that the kernel
> > > > >> cannot yet use, but at least it can parse the BTF
> > > > >> provided.  Meanwhile userspace tools with newer libbpf
> > > > >> may be able to use the newer information.
> > > > >>
> > > > >> The intent is to support encoding of kind metadata
> > > > >> optionally so that tools like pahole can add this
> > > > >> information.  So for each kind we record
> > > > >>
> > > > >> - a kind name string
> > > > >> - kind-related flags
> > > > >> - length of singular element following struct btf_type
> > > > >> - length of each of the btf_vlen() elements following
> > > > >>
> > > > >> In addition we make space in the metadata for
> > > > >> CRC32s computed over the BTF along with a CRC for
> > > > >> the base BTF; this allows split BTF to identify
> > > > >> a mismatch explicitly.  Finally we provide an
> > > > >> offset for an optional description string.
> > > > >>
> > > > >> The ideas here were discussed at [1] hence
> > > > >>
> > > > >> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > >> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
> > > > >>
> > > > >> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@xxxxxxxxxxxxxx/
> > > > >> ---
> > > > >>  include/uapi/linux/btf.h       | 29 +++++++++++++++++++++++++++++
> > > > >>  tools/include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++
> > > > >>  2 files changed, 58 insertions(+)
> > > > >>
> > > > >> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
> > > > >> index ec1798b6d3ff..94c1f4518249 100644
> > > > >> --- a/include/uapi/linux/btf.h
> > > > >> +++ b/include/uapi/linux/btf.h
> > > > >> @@ -8,6 +8,34 @@
> > > > >>  #define BTF_MAGIC   0xeB9F
> > > > >>  #define BTF_VERSION 1
> > > > >>
> > > > >> +/* is this information required? If so it cannot be sanitized safely. */
> > > > >> +#define BTF_KIND_META_OPTIONAL              (1 << 0)
> >
> > Another flag I was thinking about was a flag whether struct btf_type's
> > type/size field is a type or a size (or something else). E.g., let's
> > say we haven't had btf_type_tag yet and were adding it after we had
> > this new metadata. We could say that type_tag's type/size field is
> > actually a type ID, and generic tools like bpftool could basically
> > skip type_tag and resolve to underlying type. This way, optional
> > modifier/decorator KINDs won't even have to break applications using
> > old libbpf's when it comes to calculating type sizes and resolving
> > them.
>
> +1
>
> > > > >> +
> > > > >> +struct btf_kind_meta {
> > > > >> +    __u32 name_off;         /* kind name string offset */
> >
> > I'm not sure why we'd need to record this for every KIND? The tool
> > that doesn't know about this new kind can't do much about it anyways,
> > so whether it knows that this is "KIND_NEW_FANCY" or just its ID #123
> > doesn't make much difference?
>
> The name is certainly more meaningful than 123.
> bpftool output is consumed by humans who will be able to tell the difference.
> I'd keep the name here.

Ok, it's fine. When I was originally proposing this compact metadata,
I was trying to make it as minimal as possible, so adding 80 bytes
just for string offset fields (plus a bunch of strings) felt like an
unnecessary overhead. But it's not a big deal.

>
> > > > > and would bump the BTF_VERSION to 2 to make it a 'milestone'.
> >
> > Bumping BTF_VERSION to 2 automatically makes BTF incompatible with all
> > existing kernels (and potentially many tools that parse BTF). Given we
> > can actually extend BTF in backwards compatible way by just adding an
> > optional two fields to btf_header + extra bytes for metadata sections,
> > why making our lives harder by bumping this version?
>
> I fail to see how bumping the version makes it harder.
> libbpf needs to sanitize meta* fields in the struct btf_header on
> older kernels anway. At the same time sanitizing the version from 2 to
> 1
> in the same header is one extra line of code in libbpf.
> What am I missing?

So I checked libbpf code, and libbpf doesn't really check the version
field. So for the most part this BTF_VERSION bump wouldn't matter for
any tool that's based on libbpf's struct btf API. But if libbpf did
check version (as it probably should have), then by upgrading to newer
Clang that would emit BTF with this metadata (but no new fancy
BTF_KIND_KERNEL_FUNC or anything like that), we automatically make
such BTF incompatible with all those tools.

Kernel is a bit different because it's extremely strict about BTF. I'm
more worried about tools like bpftool (but we don't check BTF_VERSION
there due to libbpf), llvm-objdump (when it supports BTF), etc.

On the other hand, what do we gain by bumping this BTF_VERSION?

>
> >
> > > > > v2 -> self described.
> > > >
> > > > sure, sounds good. One other change perhaps worth making; currently
> > > > we assume that the kind metadata is at the end of the struct
> > > > btf_metadata, but if we ever wanted to add metadata fields in the
> > > > future, we'd want so support both the current metadata structure and
> > > > any future structure which had additional fields.
> >
> > see above, another reason to make metadata a separate section, in
> > addition to types and strings
> >
> > > >
> > > > With that in mind, it might make sense to go with something like
> > > >
> > > > struct btf_metadata {
> > > >         __u32   kind_meta_cnt;
> > > >         __u32   kind_meta_offset;       /* kind_meta_cnt instances of struct
> > > > btf_kind_meta start here */
> > > >         __u32   flags;
> > > >         __u32   description_off;        /* optional description string*/
> > > >         __u32   crc;                    /* crc32 of BTF */
> > > >         __u32   base_crc;               /* crc32 of base BTF */
> > > > };
> > > >
> > > > For the original version, kind_meta_offset would just be
> > > > at meta_off + sizeof(struct btf_metadata), but if we had multiple
> > > > versions of the btf_metadata header to handle, they could all rely on
> > > > the kind_meta_offset being where kind metadata is stored.
> > > > For validation we'd have to make sure kind_meta_offset was within
> > > > the the metadata header range.
> > >
> > > kind_meta_offset is an ok idea, but I don't quite see why we'd have
> > > multiple 'struct btf_metadata' pointing to the same set of 'struct
> > > btf_kind_meta'.
> > >
> > > Also why do we need description_off ? Shouldn't string go into
> > > btf_header->str_off ?
> > >
> > > > >
> > > > >> +    __u32   flags;
> > > > >> +    __u32   description_off;        /* optional description string */
> > > > >> +    __u32   crc;                    /* crc32 of BTF */
> > > > >> +    __u32   base_crc;               /* crc32 of base BTF */
> > > > >
> > > > > Hard coded CRC also gives me a pause.
> > > > > Should it be an optional KIND like btf tags?
> > > >
> > > > The goal of the CRC is really just to provide a unique identifier that
> > > > we can use for things like checking if there's a mismatch between
> > > > base and module BTF. If we want to ever do CRC validation (not sure
> > > > if there's a case for that) we probably need to think about cases like
> > > > BTF sanitization of BPF program BTF; this would likely only be an
> > > > issue if metadata support is added to BPF compilers.
> > > >
> > > > The problem with adding it via a kind is that if we first compute
> > > > the CRC over the entire BTF object and then add the kind, the addition
> > > > of the kind breaks the CRC; as a result I _think_ we're stuck with
> > > > having to have it in the header.
> > >
> > > Hmm. libbpf can add BTF_KIND_CRC with zero-ed u32 crc field
> > > and later fill it in.
> > > It's really not different than u32 crc field inside 'struct btf_metadata'.
> > >
> > > > That said I don't think CRC is necessarily the only identifier
> > > > we could use, and we don't even need to identify it as a
> > > > CRC in the UAPI, just as a "unique identifier"; that would deal
> > > > with issues about breaking the CRC during sanitization. All
> > > > depends on whether we ever see a need to verify BTF via CRC
> > > > really.
> > >
> > > Right. It could be sha or anything else, but user space and kernel
> > > need to agree on the math to compute it, so something got to indicate
> > > that this 32-bit is a crc.
> > > Hence KIND_CRC, KIND_SHA fit better.
> >
> > what if instead of crc and base_src fields, we have
> >
> > __u32 id_str_off;
> > __u32 base_id_str_off;
> >
> > and they are offsets into a string section. We can then define that
> > those strings have to be something like "crc:<crc-value>" or
> > "sha:<sha-value". This will be a generic ID, and extensible (and more
> > easily extensible, probably), but won't require new KIND.
>
> Encoding binary data in strings with \0 and other escape chars?
> Ouch. Please no.
> We can have variable size KIND_ID and encode crc vs sha in flags,
> but binary data better stay binary.

It's just a string representation of a hex dump of a byte array (or
u32 in crc32 case)? Only one of [0123456789abcdef], that's all. Just
like we have Git SHA string representation.

We don't have variable-sized KINDs, unless you are proposing to use
vlen as "number of bytes" of ID payload. And another KIND is
automatically breaking backwards compat for all existing tools. For no
good reason. This whole metadata is completely optional right now for
anything that's using libbpf for BTF parsing. But adding KIND_ID makes
it not optional at all.

Not sure what new KIND gives us in this case. This ID (whether it's
"crc:abcdef" or just "661866dbea52bfac7420cd35d0e502d4ccc11bb6" or
whatever) can be used by all application as is for comparison, you
don't need to understand how it is generated at all.

>
> > This also has a good property that 0 means "no ID", which helps with
> > the base BTF case. Current "__u32 crc;" doesn't have this property and
> > requires a flag.
>
> imo this crc addition is a litmus test for this self-described format.
> If we cannot encode it as a new KIND* it means this self-described
> idea is broken.

We can, but this can be a straightforward and simple *opaque* string
ID, so the new kind just seems unnecessary.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux