Re: [RFC bpf-next 2/5] libbpf: provide libbpf API to encode BTF kind information

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/11/2022 17:01, Andrii Nakryiko wrote:
> On Tue, Nov 29, 2022 at 5:51 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>>
<snip>>>> I really don't like this approach, even if *technically* it would
>>> work. But even if so, it would add quite a bunch of size to BTF just
>>> to self-describe it.
>>>
>>> Let's go again (and in more detail) over my alternative proposal I
>>> briefly described in another email thread.
>>>
>>> So, what I'm proposing is similar in spirit and solves all the same
>>> goals you have (and actually some more, I'll point this out below).
>>> The only downside is that we'll need to, again, teach kernel to
>>> understand this BTF format extension to allow kernel to use it (so we
>>> still will need an opt-in flag for pahole, unfortunately, but
>>> hopefully just this one time). That's pretty much the only downside.
>>> But it's more compact, simpler and more straightforward, more elegant
>>> (IMO), and it is easy for libbpf to sanitize it for old kernels.
>>>
>>> Ok, so it's pretty much completely described by these changes:
>>>
>>> --- a/include/uapi/linux/btf.h
>>> +++ b/include/uapi/linux/btf.h
>>> @@ -8,6 +8,21 @@
>>>  #define BTF_MAGIC      0xeB9F
>>>  #define BTF_VERSION    1
>>>
>>> +struct btf_kind_meta {
>>> +       /* extra flags, initially define just one:
>>> +        * 0x01 - required or optional (is it safe to skip if unknown)
>>> +        */
>>> +       __u16 flags;
>>> +       __u8 info_sz;
>>> +       __u8 elem_sz;
>>> +};
>>> +
>>> +struct btf_metadata {
>>> +       __u8 kind_meta_cnt;
>>> +       __u32 :0;
>>> +       struct btf_kind_meta[];
>>> +};
>>> +
>>>  struct btf_header {
>>>         __u16   magic;
>>>         __u8    version;
>>> @@ -19,6 +34,8 @@ struct btf_header {
>>>         __u32   type_len;       /* length of type section       */
>>>         __u32   str_off;        /* offset of string section     */
>>>         __u32   str_len;        /* length of string section     */
>>> +       __u32   meta_off;
>>> +       __u32   meta_len;
>>>  };
>>>
>>
>> Ok, if we're going this route though, let's try to think through any
>> other info we need to add so the format changes are a one-time thing.
>> We should add flags too. One current use-case would be the
>> "is this BTF standalone, or does it require base BTF?" [1]. Either using
>> an existing value in the header flags field, or using the space for a flags
>> field in  struct btf_metadata would probably make sense.
> 
> Yes, it's a good idea. But instead of a flag, I wonder if we should
> add some sort of "build ID" concept here, so that we can check
> validity of base BTF as expected by split BTF?
>

I think that would be valuable; it would be great to be able
to spot up-front an incompatibility between split and base
BTF. Are you thinking a hash over the type and string sections
or similar? Any such id shouldn't require actual BTF parsing
I think, since a simple validation could occur absent actual
parsing of the base BTF object. Would we maintain an id 
for base and split BTF, or just record the base id in split BTF
to validate the base? Not needing to recompute the base id
each time for module BTF generation seems like it would make 
it worthwhile to record the BTF id of the current object as well 
as the id of the base object it is built upon.

So something like

struct btf_metadata {
	__u32 id;
	__u32 base_id;
	__u8 kind_meta_cnt;
	__u32 :0;
	struct btf_kind_meta[];
};

...where a 0 base_id implies the object is a root/standalone BTF object?

 
>>
>> Do we have any other outstanding issues with BTF that would be eased
>> by some sort of up-front declaration? If we can at least tackle those
>> things at once, the pain will be somewhat less when updating the toolchain.
> 
> Base vs split BTF + some check whether base BTF is valid is the only
> thing that currently comes to mind.
>

The topic of multiple levels of split BTF has come up before, but I don't 
think that has any additional implications from a metadata perspective;
each level would specify the base_id of the level below.

>>
>>>
>>> So, we add meta_off/meta_len fields to btf_header, which, if non-zero,
>>> will point to a piece of metadata (4-byte aligned) that's described by
>>> struct btf_metadata.
>>>
>>> In btf_metadata, the first byte records the number of known BTF kinds,
>>> we have three more bytes for extra flags or counters for
>>> extensibility, they should be zeroed out right now.
>>>
>>
>> Right; see above for one flags use-case.
>>
>>> After these 4 bytes we have kind_meta_cnt struct btf_kind_meta
>>> entries, each 4-byte long. It's a 1-indexed array, where each entry
>>> corresponds to sequentially numbered BTF kinds. First two bytes are
>>> reserved for flags and stuff like that. Among those, I think the most
>>> useful right now would be the "optional flag". If set, it would mean
>>> that generally speaking it's safe to skip types of that kind without
>>> losing integrity of the data. So e.g., we could have used that for
>>> DECL_TAGS, or perhaps even for FUNCs, if we had this metadata back
>>> then, as these kinds are, generally speaking, not referenced from
>>> other types (not 100% for FUNCs, as we can have FUNC externs, but
>>> those came later). Anyways, for kernel needs we can say that optional
>>> kinds don't cause failure to validate BTF.
>>>
>>
>> This would definitely be useful; but are you saying here that
>> a struct with a reference to an unknown kind should fail BTF
>> validation (something like a struct with an enum64 member parsed by a
>> libbpf prior to enum64 support)? Not sure there's any alternative
>> for a case like that...
> 
> From the kernel validation point -- yes, probably. From generic
> tooling and libbpf-side -- perhaps not. I think kernel will always
> have to be pretty strict due to security reasons.
> 
> 
>>
>>> *But for security reasons we should make the kernel zero-out
>>> corresponding parts of type information, just to prevent injection of
>>> well-known data by malicious user*.
>>>
>>> Next, to the meat of the proposal. info_sz is size in bytes of an
>>> additional singular information (e.g., btf_array for ARRAY kind,
>>> 4-byte info for INT kind, etc) that goes after common 12-byte struct
>>> btf_type. It can be zero, of course. elem_sz is a size in bytes of
>>> each nested element (field info for STRUCT, arg info for FUNC_ARG,
>>> etc). Number of elements is defined by btf_vlen(t), which works for
>>> any kind, regardless if it's known or not. If elem_sz is zero, KIND
>>> can't have nested elements (and thus if vlen is non-zero, that's a
>>> corruption).
>>>
>>> That's it. We don't allow mixing differently-sized nested elements
>>> within a single kind, but we don't have that today and we don't have
>>> any meaningful ways to express this. And I don't think we'd want to do
>>> this anyways (there are way to work around that if absolutely
>>> necessary, as well).
>>>
>>> From libbpf's point of view, this metadata section is easy to
>>> sanitize, as kernel allows btf_headers of bigger size than is known to
>>> it, provided they are zeroed out. So libbpf will just zero out
>>> meta_off/meta_len fields, and contents of the metadata section.
>>>
>>> As for the size, it adds just 8 + 4 + 19 * 4 = 88 bytes to the overall
>>> BTF size. It's nothing. I didn't count the total size for your
>>> approach, but at the very least it would be 19 * 2 * sizeof(struct
>>> btf_type) (=12) = 456, but that's super conservative.
>>>
>>> Note also that each btf_type can always have a name (described by
>>> btf_type->name_off), so generic BTF tools can easily output what is
>>> the name of the skipped entity, regardless of its actual kind. Tools
>>> can also point out how many nested elements it is supposed to have.
>>> Both are quite nice features, IMO.
>>>
>>> Anyways, that's what I had in mind. I think we should bite a bullet
>>> and do it, so that future extensions can make use of this
>>> self-describing metadata.
>>>
>>> Thoughts?
>>>
>>
>> It'll work, a few specific questions we should probably resolve up front:
>>
>> - We can deduce the presence of the metadata info from the header length, so we
>>   don't need a BTF version bump, right?
> 
> yep
> 
>>
>> - from the encoding perspective, you mentioned having metadata opt-in;
>>   so I presume we'd have a btf__add_metadata() API (it is zero by default so
>>   accepted by the kernel I think) if --encode_metadata is set? Perhaps eventually
>>   we could move to opt-out.
> 
> I'd say that btf__new() should by default produce metadata, unless
> opted out through opts. But pahole should default for opt-out to not
> regress on old kernels built with new pahole.
> 

Ok; we'll need new APIs btf__new_empty[_split]_opts() to handle this I think.

Alan

>>
>> - there are some cases where what is valid has evolved over time. For example,
>>   kind flags have appeared for some kinds; should we have a flag for "supports kind
>>   flag"? (set for struct/union/enum/fwd/eum64)?
>>
> 
> "supports kind flag" seems way too specific, tbh. Seems wrong to have
> such a flag.
> 
> 
>> I can probably respin what I have, unless you want to take it on?
> 
> Let's discuss base vs split BTF identification first.
> 
>>
>> [1] https://lore.kernel.org/bpf/CAEf4BzYXRT9pFmC1RqnNBmvQWGQkd0zs9rbH9z9Ug8FWOArb_Q@xxxxxxxxxxxxxx/
>>
>>>
>>>> +
>>>> +/* info used to encode a kind metadata field */
>>>> +struct btf_meta_field {
>>>> +       const char *type;
>>>> +       const char *name;
>>>> +       int size;
>>>> +       int type_id;
>>>> +};
>>>> +
>>>> +#define BTF_MAX_META_FIELDS             10
>>>> +
>>>> +#define BTF_META_FIELD(__type, __name)                                 \
>>>> +       { .type = #__type, .name = #__name, .size = sizeof(__type) }
>>>> +
>>>> +#define BTF_KIND_STR(__kind)   #__kind
>>>> +
>>>> +struct btf_kind_encoding {
>>>> +       struct btf_kind_desc kind;
>>>> +       struct btf_meta_field meta[BTF_MAX_META_FIELDS];
>>>> +};
>>>> +
>>>> +#define BTF_KIND(__name, __nr_meta, __meta_size, ...)                  \
>>>> +       { .kind = {                                                     \
>>>> +         .kind = BTF_KIND_##__name,                                    \
>>>> +         .struct_name = BTF_KIND_PFX#__name,                           \
>>>> +         .meta_name = BTF_KIND_META_PFX #__name,                       \
>>>> +         .nr_meta = __nr_meta,                                         \
>>>> +         .meta_size = __meta_size,                                     \
>>>> +       }, .meta = { __VA_ARGS__ } }
>>>> +
>>>> +struct btf_kind_encoding kinds[] = {
>>>> +       BTF_KIND(UNKN,          0,      0),
>>>> +
>>>> +       BTF_KIND(INT,           0,      0),
>>>> +
>>>> +       BTF_KIND(PTR,           0,      0),
>>>> +
>>>> +       BTF_KIND(ARRAY,         1,      sizeof(struct btf_array),
>>>> +                                       BTF_META_FIELD(__u32, type),
>>>> +                                       BTF_META_FIELD(__u32, index_type),
>>>> +                                       BTF_META_FIELD(__u32, nelems)),
>>>> +
>>>> +       BTF_KIND(STRUCT,        0,      sizeof(struct btf_member),
>>>> +                                       BTF_META_FIELD(__u32, name_off),
>>>> +                                       BTF_META_FIELD(__u32, type),
>>>> +                                       BTF_META_FIELD(__u32, offset)),
>>>> +
>>>> +       BTF_KIND(UNION,         0,      sizeof(struct btf_member),
>>>> +                                       BTF_META_FIELD(__u32, name_off),
>>>> +                                       BTF_META_FIELD(__u32, type),
>>>> +                                       BTF_META_FIELD(__u32, offset)),
>>>> +
>>>> +       BTF_KIND(ENUM,          0,      sizeof(struct btf_enum),
>>>> +                                       BTF_META_FIELD(__u32, name_off),
>>>> +                                       BTF_META_FIELD(__s32, val)),
>>>> +
>>>> +       BTF_KIND(FWD,           0,      0),
>>>> +
>>>> +       BTF_KIND(TYPEDEF,       0,      0),
>>>> +
>>>> +       BTF_KIND(VOLATILE,      0,      0),
>>>> +
>>>> +       BTF_KIND(CONST,         0,      0),
>>>> +
>>>> +       BTF_KIND(RESTRICT,      0,      0),
>>>> +
>>>> +       BTF_KIND(FUNC,          0,      0),
>>>> +
>>>> +       BTF_KIND(FUNC_PROTO,    0,      sizeof(struct btf_param),
>>>> +                                       BTF_META_FIELD(__u32, name_off),
>>>> +                                       BTF_META_FIELD(__u32, type)),
>>>> +
>>>> +       BTF_KIND(VAR,           1,      sizeof(struct btf_var),
>>>> +                                       BTF_META_FIELD(__u32, linkage)),
>>>> +
>>>> +       BTF_KIND(DATASEC,       0,      sizeof(struct btf_var_secinfo),
>>>> +                                       BTF_META_FIELD(__u32, type),
>>>> +                                       BTF_META_FIELD(__u32, offset),
>>>> +                                       BTF_META_FIELD(__u32, size)),
>>>> +
>>>> +
>>>> +       BTF_KIND(FLOAT,         0,      0),
>>>> +
>>>> +       BTF_KIND(DECL_TAG,      1,      sizeof(struct btf_decl_tag),
>>>> +                                       BTF_META_FIELD(__s32, component_idx)),
>>>> +
>>>> +       BTF_KIND(TYPE_TAG,      0,      0),
>>>> +
>>>> +       BTF_KIND(ENUM64,        0,      sizeof(struct btf_enum64),
>>>> +                                       BTF_META_FIELD(__u32, name_off),
>>>> +                                       BTF_META_FIELD(__u32, val_lo32),
>>>> +                                       BTF_META_FIELD(__u32, val_hi32)),
>>>> +};
>>>> +
>>>> +/* Try to add representations of the kinds supported to BTF provided.  This will allow parsers
>>>> + * to decode kinds they do not support and skip over them.
>>>> + */
>>>> +int btf__add_kinds(struct btf *btf)
>>>> +{
>>>> +       int btf_type_id, __u32_id, __s32_id, struct_type_id;
>>>> +       char name[64];
>>>> +       int i;
>>>> +
>>>> +       /* should have base types; if not bootstrap them. */
>>>> +       __u32_id = btf__find_by_name(btf, "__u32");
>>>> +       if (__u32_id < 0) {
>>>> +               __s32 unsigned_int_id = btf__find_by_name(btf, "unsigned int");
>>>> +
>>>> +               if (unsigned_int_id < 0)
>>>> +                       unsigned_int_id = btf__add_int(btf, "unsigned int", 4, 0);
>>>> +               __u32_id = btf__add_typedef(btf, "__u32", unsigned_int_id);
>>>> +       }
>>>> +       __s32_id = btf__find_by_name(btf, "__s32");
>>>> +       if (__s32_id < 0) {
>>>> +               __s32 int_id = btf__find_by_name_kind(btf, "int", BTF_KIND_INT);
>>>> +
>>>> +               if (int_id < 0)
>>>> +                       int_id = btf__add_int(btf, "int", 4, BTF_INT_SIGNED);
>>>> +               __s32_id = btf__add_typedef(btf, "__s32", int_id);
>>>> +       }
>>>> +
>>>> +       /* add "struct __btf_type" if not already present. */
>>>> +       btf_type_id = btf__find_by_name(btf, "__btf_type");
>>>> +       if (btf_type_id < 0) {
>>>> +               __s32 union_id = btf__add_union(btf, NULL, sizeof(__u32));
>>>> +
>>>> +               btf__add_field(btf, "size", __u32_id, 0, 0);
>>>> +               btf__add_field(btf, "type", __u32_id, 0, 0);
>>>> +
>>>> +               btf_type_id = btf__add_struct(btf, "__btf_type", sizeof(struct btf_type));
>>>> +               btf__add_field(btf, "name_off", __u32_id, 0, 0);
>>>> +               btf__add_field(btf, "info", __u32_id, sizeof(__u32) * 8, 0);
>>>> +               btf__add_field(btf, NULL, union_id, sizeof(__u32) * 16, 0);
>>>> +       }
>>>> +
>>>> +       for (i = 0; i < ARRAY_SIZE(kinds); i++) {
>>>> +               struct btf_kind_encoding *kind = &kinds[i];
>>>> +               int meta_id, array_id = 0;
>>>> +
>>>> +               if (btf__find_by_name(btf, kind->kind.struct_name) > 0)
>>>> +                       continue;
>>>> +
>>>> +               if (kind->kind.meta_size != 0) {
>>>> +                       struct btf_meta_field *field;
>>>> +                       __u32 bit_offset = 0;
>>>> +                       int j;
>>>> +
>>>> +                       meta_id = btf__add_struct(btf, kind->kind.meta_name, kind->kind.meta_size);
>>>> +
>>>> +                       for (j = 0; bit_offset < kind->kind.meta_size * 8; j++) {
>>>> +                               field = &kind->meta[j];
>>>> +
>>>> +                               field->type_id = btf__find_by_name(btf, field->type);
>>>> +                               if (field->type_id < 0) {
>>>> +                                       pr_debug("cannot find type '%s' for kind '%s' field '%s'\n",
>>>> +                                                kind->meta[j].type, kind->kind.struct_name,
>>>> +                                                kind->meta[j].name);
>>>> +                               } else {
>>>> +                                       btf__add_field(btf, field->name, field->type_id, bit_offset, 0);
>>>> +                               }
>>>> +                               bit_offset += field->size * 8;
>>>> +                       }
>>>> +                       array_id = btf__add_array(btf, __u32_id, meta_id,
>>>> +                                                 kind->kind.nr_meta);
>>>> +
>>>> +               }
>>>> +               struct_type_id = btf__add_struct(btf, kind->kind.struct_name,
>>>> +                                                sizeof(struct btf_type) +
>>>> +                                                (kind->kind.nr_meta * kind->kind.meta_size));
>>>> +               btf__add_field(btf, "type", btf_type_id, 0, 0);
>>>> +               if (kind->kind.meta_size != 0)
>>>> +                       btf__add_field(btf, "meta", array_id, sizeof(struct btf_type) * 8, 0);
>>>> +               snprintf(name, sizeof(name), BTF_KIND_PFX "%u", i);
>>>> +               btf__add_typedef(btf, name, struct_type_id);
>>>> +       }
>>>> +       return 0;
>>>> +}
>>>> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
>>>> index 8e6880d..a054082 100644
>>>> --- a/tools/lib/bpf/btf.h
>>>> +++ b/tools/lib/bpf/btf.h
>>>> @@ -219,6 +219,16 @@ LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id,
>>>>  LIBBPF_API int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
>>>>                             int component_idx);
>>>>
>>>> +/**
>>>> + * @brief **btf__add_kinds()** adds BTF representations of the kind encoding for
>>>> + * all of the kinds known to libbpf.  This ensures that when BTF is encoded, it
>>>> + * will include enough information for parsers to decode (and skip over) kinds
>>>> + * that the parser does not know about yet.  This ensures that an older BTF
>>>> + * parser can read newer BTF, and avoids the need for the BTF encoder to limit
>>>> + * which kinds it emits to make decoding easier.
>>>> + */
>>>> +LIBBPF_API int btf__add_kinds(struct btf *btf);
>>>> +
>>>>  struct btf_dedup_opts {
>>>>         size_t sz;
>>>>         /* optional .BTF.ext info to dedup along the main BTF info */
>>>> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
>>>> index 71bf569..6121ff1 100644
>>>> --- a/tools/lib/bpf/libbpf.map
>>>> +++ b/tools/lib/bpf/libbpf.map
>>>> @@ -375,6 +375,7 @@ LIBBPF_1.1.0 {
>>>>                 bpf_link_get_fd_by_id_opts;
>>>>                 bpf_map_get_fd_by_id_opts;
>>>>                 bpf_prog_get_fd_by_id_opts;
>>>> +               btf__add_kinds;
>>>>                 user_ring_buffer__discard;
>>>>                 user_ring_buffer__free;
>>>>                 user_ring_buffer__new;
>>>> --
>>>> 1.8.3.1
>>>>



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux