In a recent discussion on the BPF mailing list ([1], look for
Solution #2) participants agreed to add a new DWARF representation
for "btf_type_tag" annotations.

The existing representation is a DW_TAG_LLVM_annotation object
attached as a child to a DW_TAG_pointer_type. It means that the
"btf_type_tag" annotation is attached to the pointee type.

The new representation is a DW_TAG_LLVM_annotation object attached as
a child to *any* type. It means that the "btf_type_tag" annotation is
attached to the parent type.

For example, for the following C code:

    int __attribute__((btf_type_tag("tag1"))) *g;

Clang generates the following DWARF (1):

    0x0000001e:   DW_TAG_variable
                    DW_AT_name      ("g")
                    DW_AT_type      (0x00000029 "int *")

    0x00000029:   DW_TAG_pointer_type
                    DW_AT_type      (0x00000032 "int")

    0x0000002e:     DW_TAG_LLVM_annotation
                      DW_AT_name    ("btf_type_tag")
                      DW_AT_const_value     ("tag1")

    0x00000032:   DW_TAG_base_type
                    DW_AT_name      ("int")

With the new representation scheme the DWARF looks as follows (2):

    0x0000001e:   DW_TAG_variable
                    DW_AT_name      ("g")
                    DW_AT_type      (0x00000029 "int *")

    0x00000029:   DW_TAG_pointer_type
                    DW_AT_type      (0x00000032 "int")

    0x00000032:   DW_TAG_base_type
                    DW_AT_name      ("int")
                    DW_AT_encoding  (DW_ATE_signed)
                    DW_AT_byte_size (0x04)

    0x00000036:     DW_TAG_LLVM_annotation
                      DW_AT_name    ("btf:type_tag")
                      DW_AT_const_value     ("tag1")

Note that in (1) DW_TAG_LLVM_annotation is a child of
DW_TAG_pointer_type, while in (2) it is a child of DW_TAG_base_type.
The annotation name also differs: "btf_type_tag" in (1) and
"btf:type_tag" in (2).

This patch-set adds the logic necessary to handle such annotations in
the pahole tool.
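To make the difference between the two forms concrete, here is a
minimal sketch in C. The `struct die` model and the `annotated_type()`
helper are hypothetical and written only for illustration; they are
not pahole's or libdw's actual data structures. The sketch assumes
that the old form is recognized by the name "btf_type_tag" under a
pointer DIE and the new form by the name "btf:type_tag" under any
type DIE, as in dumps (1) and (2).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified DIE model (not pahole's object model). */
enum die_kind { DIE_POINTER, DIE_BASE, DIE_ANNOTATION };

struct die {
	enum die_kind kind;
	const char *name;         /* DW_AT_name, may be NULL */
	const char *value;        /* DW_AT_const_value, annotations only */
	const struct die *type;   /* DW_AT_type target, may be NULL */
	const struct die *parent; /* enclosing DIE, may be NULL */
};

/*
 * Return the type DIE that a type tag annotation applies to,
 * or NULL if 'ann' is not a recognized type tag annotation.
 */
const struct die *annotated_type(const struct die *ann)
{
	assert(ann != NULL);

	if (ann->kind != DIE_ANNOTATION || !ann->name || !ann->parent)
		return NULL;

	/* New form: the tag applies to the parent type itself. */
	if (strcmp(ann->name, "btf:type_tag") == 0)
		return ann->parent;

	/* Old form: child of a pointer, the tag applies to the pointee. */
	if (strcmp(ann->name, "btf_type_tag") == 0 &&
	    ann->parent->kind == DIE_POINTER)
		return ann->parent->type;

	return NULL;
}
```

For dump (1) the helper follows DW_AT_type of the parent pointer and
yields the "int" base type; for dump (2) it yields the parent "int"
base type directly. Both dumps therefore describe the same tagged
type, reached along different edges of the DIE tree.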
Examples like the one below are supported:

    #define __tag(val) __attribute__((btf_type_tag("__" #val)))

    struct alpha {};
    union bravo {};
    enum charlie { X };
    typedef int delta;

    struct echo {
      int * __tag(a) a;
      int __tag(b) *b;
      int __tag(c) c;
      void __tag(d) *d;
      void __tag(e) *e;
      struct alpha __tag(f) f;
      union bravo __tag(g) g;
      enum charlie __tag(h) h;
      delta __tag(i) i;
      int __tag(j_result) (__tag(j) *j)(int __tag(j_param));
    } g;

Implementation details
----------------------

Although this was not discussed on the mailing list, the proposed
implementation behaves as follows (for compatibility reasons):
- both forms may be present in the debug info;
- if any annotations in the new form are present in the debug info,
  annotations in the old form are ignored.

V3 of this patch-set includes the changes suggested in discussion [2],
which simplify the implementation.

Here is an overview of each patch in the patch-set:

1. "dwarves.h: expose ptr_table interface"

   Makes the functions related to struct ptr_table accessible from
   the dwarves.h header.

2. "dwarf_loader: Track unspecified types in a separate list"

   According to [1], the new type tag encoding for the following
   case:

       void __attribute__((btf_type_tag("tag1"))) *g;

   creates a DIE of kind DW_TAG_unspecified_type with name "void" and
   attaches DW_TAG_LLVM_annotation children to it. Later patches rely
   on the identity of this unspecified type instance for recoding, so
   this patch introduces a special tag type to represent
   DW_TAG_unspecified_type DIEs in the object model.

3. "dwarf_loader: handle btf_type_tag w/o special pointer type"

   Changes the way type tags are encoded in the dwarves.h object
   model:
   - the special pointer type btf_type_tag_ptr_type is removed;
   - the field struct btf_type_tag_type::node is removed; instances
     of btf_type_tag_type no longer form a list, instead links
     between types are tracked via btf_type_tag_type::tag.type
     fields, as is done for derived types (e.g. DW_TAG_const_type).

4. "dwarf_loader: support btf:type_tag DW_TAG_LLVM_annotation"

   Adds support for the new type tag encoding. This includes:
   - Changes to visit child DIEs of the following types in order to
     collect DW_TAG_LLVM_annotation entries:
     - DW_TAG_unspecified_type
     - DW_TAG_base_type
     - DW_TAG_typedef
     - DW_TAG_array_type
     - DW_TAG_subroutine_type
     - DW_TAG_enumeration_type
     - DW_TAG_structure_type
   - Changes in the recode phase.

5. "dwarf_loader: move type tags before CVR qualifiers when necessary"

   The kernel expects type tags to precede CVR qualifiers in BTF.
   However, the DWARF encoding format agreed with the GCC team in
   [3.2] does not allow attaching DW_TAG_LLVM_annotation tags to
   qualifiers. Hence this patch adds a post-processing step that
   converts type chains like

       CONST -> VOLATILE -> TYPE_TAG -> ...

   to

       TYPE_TAG -> CONST -> VOLATILE -> ...

6. "btf_encoder: skip type tags for VAR entry types"

   The kernel does not expect VAR entries to have types starting
   with BTF_TYPE_TAG. Before the introduction of support for
   'btf:type_tag' such situations were not possible, as TYPE_TAG
   entries were always preceded by PTR entries. This patch changes
   the BTF VAR generation code to skip any BTF_TYPE_TAG entries at
   the start of the VAR type.

The corresponding Clang changes are tracked in [3]; refer to [3.2]
for some encoding examples.

Testing
-------

To verify the changes I used the following:
- Tools:
  - "LLVM-main"   :: LLVM at revision [4];
  - "LLVM-new"    :: LLVM at revision [4] with patches [3] applied;
  - "gcc"         :: GCC version 11.3 (no support for btf_type_tag
                     annotations);
  - "pahole-next" :: dwarves at revision [5];
  - "pahole-new"  :: dwarves at revision [5] + this patch-set;
  - "kernel"      :: Linux Kernel bpf-next branch at revision [6]
                     with CI patch [7].
- test cases:
  - kernel build;
  - kernel BPF test cases build, BPF tests execution (test_verifier,
    test_progs, test_progs-no_alu32, test_maps);
  - btfdiff script (suggested by Arnaldo, [8]).
- tool combinations (kernel compiler / clang for BPF tests / pahole
  version):
  - LLVM-main / LLVM-main / pahole-new
    - kernel build : ok
    - bpf tests    : ok
    - btfdiff      : ok (modulo diff #1, see below)
  - gcc / LLVM-main / pahole-new
    - kernel build : ok
    - bpf tests    : ok
    - btfdiff      : ok but dwarf dump sometimes segfaults
  - LLVM-new / LLVM-new / pahole-next
    - kernel build : ok (modulo warn #1, see below)
    - bpf tests    : ok
    - btfdiff      : ok (modulo diff #1, see below)
  - LLVM-new / LLVM-new / pahole-new
    - kernel build : ok
    - bpf tests    : ok
    - btfdiff      : ok (modulo diff #1, see below)
  - gcc / LLVM-new / pahole-new
    - kernel build : ok
    - bpf tests    : ok
    - btfdiff      : ok

Diff #1: Difference in flexible array printing, several occurrences
as below:

  @ -10531,7 +10531,7 @ struct bpf_cand_cache {
          struct {
                  const struct btf * btf;          /*    16     8 */
                  u32                id;           /*    24     4 */
  -       } cands[0]; /*    16     0 */
  +       } cands[]; /*    16     0 */

Warn #1: pahole-next complains about unexpected child tags generated
by clang, e.g.:

  die__create_new_tag: unspecified_type WITH children!
  die__create_new_base_type: DW_TAG_base_type WITH children!

Changelog
---------

V2 -> V3:
- Suggestion [2] from Arnaldo to represent type tags as separate
  derived types is applied. As a consequence, V3 rewrites V2 almost
  completely.
- "dwarf_loader: move type tags before CVR qualifiers when necessary"
  is added after discussion in [3.2].
- "btf_encoder: skip type tags for VAR entry types" is added after
  additional testing (I'm not sure why this was not an issue for V2).

V1 -> V2:
- The patch is split in 5 parts to (hopefully) simplify the review:
  - #1, #2: two simple patches for fprintf and btf_loader to fix a
    printing issue for types annotated by BTF type tags;
  - #3: merges `struct llvm_annotation` and
    `struct btf_type_tag_type` as a preparatory step;
  - #4: introduces `struct unspecified_type` as a preparatory step;
  - #5: main logic for `btf:type_tag` support; this one can't be
    split further w/o parts losing some functionality for kernel
    build and/or bpf tests.
- `reallocarray()` in `push_btf_type_tag_mapping()` is replaced by
  `realloc()` (suggested by Alan);
- The sequence `free(dcu->hash_tags); free(dcu->hash_types);` added
  in V1 is removed from `dwarf_cu__delete()`. It was a fix for some
  valgrind errors reported for `pahole -F dwarf`, but this is
  unrelated and the fix is incomplete.

Links & revisions
-----------------

[1] Mailing list discussion regarding `btf:type_tag`
    Various approaches are discussed, Solution #2 is accepted
    https://lore.kernel.org/bpf/87r0w9jjoq.fsf@xxxxxxxxxx/
[2] Suggestion to treat DW_TAG_LLVM_annotation as a derived tag
    https://lore.kernel.org/bpf/ZCVygOn0+zKFEqW2@xxxxxxxxxx/
[3] LLVM changes to generate btf:type_tag, revisions stack:
    [3.1] https://reviews.llvm.org/D143966
    [3.2] https://reviews.llvm.org/D143967 - this one has a number of
          examples in the description.
    [3.3] https://reviews.llvm.org/D145891
[4] LLVM revision
    commit ec77d1f3d9fc ("[lldb] Simplify predicates of find_if in BroadcastManager")
[5] Dwarves revision:
    commit 31bc0d741057 ("dwarf_loader: DW_TAG_subroutine_type may have a DW_AT_byte_size")
[6] Kernel revision:
    commit df21139441b0 ("tracing: fprobe: Initialize ret valiable to fix smatch error")
[7] Kernel CI patch:
    https://github.com/kernel-patches/vmtest/commit/2d732ac4e06631d11f4326989eea28d695efb7f5
[8] Suggestion to use btfdiff
    https://lore.kernel.org/dwarves/ZAKpZGSHTvsS4r8E@xxxxxxxxxx/T/#mddbfe661e339485fb2b0e706b313
[V1] https://lore.kernel.org/dwarves/2232e368e55eb401bde45ce1b20fb710e379ae9c.camel@xxxxxxxxx/T/
[V2] https://lore.kernel.org/dwarves/20230314230417.1507266-1-eddyz87@xxxxxxxxx/

Eduard Zingerman (6):
  dwarves.h: expose ptr_table interface
  dwarf_loader: Track unspecified types in a separate list
  dwarf_loader: handle btf_type_tag w/o special pointer type
  dwarf_loader: support btf:type_tag DW_TAG_LLVM_annotation
  dwarf_loader: move type tags before CVR qualifiers when necessary
  btf_encoder: skip type tags for VAR entry types

 btf_encoder.c  |  30 +-
 dwarf_loader.c | 731 ++++++++++++++++++++++++++++++++++++++++---------
 dwarves.c      |   7 +-
 dwarves.h      |  34 +--
 4 files changed, 655 insertions(+), 147 deletions(-)

--
2.40.1