Re: BTF tag support in DWARF (notes for today's BPF Office Hours)

We agreed in the meeting to implement Solution 2 below in both GCC and
clang.

The numeric value of the DW_TAG_LLVM_annotation tag will be changed so
that pahole can still handle the tags clang currently emits.  GCC and
clang will share the number of the new tag.

Thanks everyone for the feedback.

> Hello all.
>
> Find below the notes we intend to use in today's BPF office hour to
> discuss possible solutions for the current limitations in the DWARF
> representation of the btf_type_tag C attributes, and hopefully decide on
> one so we can move forward with this.
>
> The list of suggested solutions below is of course not closed: these are
> just the ones we could think about.  Better alternatives and suggestions
> are very welcome!
>
> BTF tag support in DWARF
>
> * Current situation: annotations as children DIEs for pointees
>
>   DWARF information is structured as a tree of DIE nodes.  Nodes can
>   have attributes associated to them, as well as zero or more DIE
>   children.
>    
>   clang extends DWARF with a new tag (DIE type) =DW_TAG_LLVM_annotation=.
>   Nodes of this type are used to associate a tag name with a tag value that
>   is also a string.
>
>   Example:
>
>   :  DW_TAG_LLVM_annotation
>   :     DW_AT_name        "btf_type_tag"
>   :     DW_AT_const_value "user"
>
>   At the moment, clang generates =DW_TAG_LLVM_annotation= nodes as children
>   of =DW_TAG_pointer_type= nodes.  The intended semantic is that the
>   annotation applies to the pointed-to type.
>
>   For example (indentation reflects the parent-children tree structure):
>
>   : DW_TAG_pointer_type
>   :   DW_AT_type "int"
>   :   DW_TAG_LLVM_annotation
>   :     DW_AT_name        "btf_type_tag"
>   :     DW_AT_const_value "tag1"
>
>   The example above associates an annotation named "btf_type_tag->tag1" with
>   the type pointed to by its containing pointer_type, which is "int".
>
>   This approach has the advantage that, since the new
>   =DW_TAG_LLVM_annotation= nodes are effectively used as attributes, they are
>   safely ignored by DWARF consumers that do not understand this DIE type.
>
>   But this approach also has a big caveat: a type that is not the pointee of
>   some pointer type cannot be annotated in this design.  This obviously
>   impacts simple types such as =int= but also pointer types that are not
>   pointees themselves.
>
>   For example, it is not possible to associate the tag =__tag2= with the type
>   =int **= in this example (note this is sparse/clang ordering):
>
>   : int * __tag1 * __tag2 h;
>
>   - sparse
>     +  __tag1 applies to int*, __tag2 applies to int**
>     : got int *[noderef] __tag1 *[addressable] [noderef] [toplevel] __tag2 h
>   - clang
>     + According to DWARF __tag1 applies to int*, no __tag2 (??).
>     + According to BTF  __tag1 applies to int*, no __tag2 (??).
>     : DWARF
>     : 0x00000023:   DW_TAG_variable
>     :                 DW_AT_name	("h")
>     :                 DW_AT_type	(0x0000002e "int **")
>     :
>     : 0x0000002e:   DW_TAG_pointer_type
>     :                 DW_AT_type	(0x00000037 "int *")
>     :
>     : 0x00000033:     DW_TAG_LLVM_annotation
>     :                 DW_AT_name	("btf_type_tag")
>     :                 DW_AT_const_value	("tag1")
>     : BTF
>     : [1] TYPE_TAG 'tag1' type_id=3
>     : [2] PTR '(anon)' type_id=1
>     : [3] PTR '(anon)' type_id=4
>     : [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>     : [5] VAR 'h' type_id=2, linkage=global
>     :
>     : 'h' -> ptr -> 'tag1' -> ptr -> int
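
The BTF chain shown at the end of the dump can be reproduced by following
the type_id links.  Below is a minimal Python sketch, assuming a hand-built
table that mirrors the five BTF entries above.  The table layout and the
`btf_chain` helper are illustrative only and not part of any BTF library.

```python
# Hand-transcribed from the BTF dump above:
# id -> (kind, name, referenced type_id or None).
BTF_TYPES = {
    1: ("TYPE_TAG", "tag1", 3),
    2: ("PTR", None, 1),
    3: ("PTR", None, 4),
    4: ("INT", "int", None),
    5: ("VAR", "h", 2),
}


def btf_chain(types, start_id):
    """Follow type_id links from start_id and render the chain as text."""
    parts = []
    type_id = start_id
    while type_id is not None:
        kind, name, ref = types[type_id]
        if kind == "PTR":
            parts.append("ptr")
        elif kind == "INT":
            parts.append(name)
        else:  # VAR and TYPE_TAG are rendered by their quoted name
            parts.append(f"'{name}'")
        type_id = ref
    return " -> ".join(parts)


print(btf_chain(BTF_TYPES, 5))
```

Only one TYPE_TAG entry appears on the path from 'h' to int, which is the
missing __tag2 problem in BTF form.
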
>
> * A note about `void'
>
>   The DWARF specification recommends denoting the =void= C type by
>   generating a DIE with =DW_TAG_unspecified_type= and name "void".
>
>   However, both GCC and LLVM do _not_ follow this recommendation; instead,
>   they denote the =void= type by the absence of a =DW_AT_type= attribute in
>   the containing node.
>
>   Example, for a pointer to =void=:
>
>   : 3      DW_TAG_pointer_type    [no children]
>
>   Note also that the kernel sources have sparse annotations like:
>
>   : void __user * data;
>
>   Which, using sparse ordering, means that the type which is annotated is
>   =void=.  Therefore it is very important to be able to tag the =void= basic
>   type in this design.
>
>   GDB and other DWARF consumers understand the spec-recommended way to denote
>   =void=.
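
A consumer that wants to accept both conventions for =void= has to check
for two shapes.  Below is a minimal Python sketch, assuming a toy `Die`
class with an attribute dict.  `Die`, `pointee_of` and `is_void` are
hypothetical names for illustration, not the API of any DWARF library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Die:
    """Toy stand-in for a DWARF DIE: a tag name plus an attribute dict."""
    tag: str
    attrs: dict = field(default_factory=dict)


def pointee_of(ptr: Die) -> Optional[Die]:
    """Return the DIE referenced by DW_AT_type, or None if it is absent."""
    return ptr.attrs.get("DW_AT_type")


def is_void(type_die: Optional[Die]) -> bool:
    """True if the DW_AT_type target denotes C void under either convention."""
    if type_die is None:  # GCC/LLVM convention: no DW_AT_type at all
        return True
    # Spec-recommended convention: an unspecified type named "void".
    return (type_die.tag == "DW_TAG_unspecified_type"
            and type_die.attrs.get("DW_AT_name") == "void")


gcc_style = Die("DW_TAG_pointer_type")
spec_style = Die("DW_TAG_pointer_type",
                 {"DW_AT_type": Die("DW_TAG_unspecified_type",
                                    {"DW_AT_name": "void"})})
print(is_void(pointee_of(gcc_style)), is_void(pointee_of(spec_style)))
```

Under the first convention there is no DIE standing for =void=, so there is
nothing to which a child DIE or an attribute could be attached.
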
>
> * Solution 1: annotations as qualifiers
>
>   A possible solution for this is to handle =DW_TAG_LLVM_annotation= the same
>   way as C type qualifiers are handled in DWARF: by including them in the type
>   chain linked by =DW_AT_type= attributes.
>
>   For example:
>
>   : DW_TAG_pointer_type
>   :   DW_AT_type ("btf_type_tag")
>   :
>   : DW_TAG_LLVM_annotation
>   :   DW_AT_name        "btf_type_tag"
>   :   DW_AT_const_value "tag1"
>   :   DW_AT_type        ("int")
>   :
>   : DW_TAG_base_type
>   :   DW_AT_name ("int")
>
>   Note how now the =LLVM_annotation= has the annotated type linked by
>   =DW_AT_type=, and acts itself as a type linked from =DW_TAG_pointer_type=.
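
To make the qualifier analogy concrete, here is a minimal Python sketch of
how a consumer might walk such a chain, assuming a toy `Die` class.  The
`Die` class, the `QUALIFIER_LIKE` set and `strip_qualifiers` are
hypothetical names, not taken from GCC, clang or pahole.

```python
from dataclasses import dataclass, field


@dataclass
class Die:
    """Toy stand-in for a DWARF DIE: a tag name plus an attribute dict."""
    tag: str
    attrs: dict = field(default_factory=dict)


# DIE tags that a consumer skips while looking for the underlying type.
QUALIFIER_LIKE = {
    "DW_TAG_const_type",
    "DW_TAG_volatile_type",
    "DW_TAG_LLVM_annotation",
}


def strip_qualifiers(die):
    """Skip qualifier-like DIEs; return (underlying DIE or None, tags seen)."""
    tags = []
    while die is not None and die.tag in QUALIFIER_LIKE:
        if die.tag == "DW_TAG_LLVM_annotation":
            tags.append((die.attrs["DW_AT_name"],
                         die.attrs["DW_AT_const_value"]))
        die = die.attrs.get("DW_AT_type")  # absent DW_AT_type means void
    return die, tags


# The Solution 1 example: pointer -> annotation -> int.
int_die = Die("DW_TAG_base_type", {"DW_AT_name": "int"})
annot = Die("DW_TAG_LLVM_annotation",
            {"DW_AT_name": "btf_type_tag",
             "DW_AT_const_value": "tag1",
             "DW_AT_type": int_die})
ptr = Die("DW_TAG_pointer_type", {"DW_AT_type": annot})

underlying, tags = strip_qualifiers(ptr.attrs["DW_AT_type"])
print(underlying.attrs["DW_AT_name"], tags)
```

A consumer whose qualifier set lacks =DW_TAG_LLVM_annotation= would stop at
the annotation DIE instead of reaching =int=.
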
>
>   Advantages of this approach:
>
>   - It makes sense for annotations to be implemented as qualifiers, because
>     they actually qualify a target type.
>
>   - This approach is totally flexible and makes it possible to annotate any
>     type, qualified or not, pointed-to or not.
>
>   - The resulting DWARF looks like the BTF.
>
>   - It can handle annotated `void', as currently generated by GCC and
>     clang/LLVM:
>
>     :   DW_TAG_LLVM_annotation
>     :     DW_AT_name        "btf_type_tag"
>     :     DW_AT_const_value "tag1"
>     :     DW_AT_type NULL
>
>   Disadvantages of this approach:
>
>   - Implementing this is more elaborate, and it requires DWARF consumers to
>     understand this new DIE type in order to follow the type chains in the
>     tree: =DW_TAG_LLVM_annotation= should now be expected in any =DW_AT_type=
>     reference.
>
>   - This breaks DWARF, making it very difficult to implement as a
>     compiler extension, and will likely require making it part of DWARF.
>
>   - This is not backwards compatible with what clang currently generates.
>
> * Solution 2: annotations as children DIEs
>
>   This approach involves keeping the =DW_TAG_LLVM_annotation= DIE, with the
>   same internal structure it has now, but associating it with the type DIE
>   that is its parent.  (Note this is not the same as being linked by a
>   =DW_AT_type= attribute like in Solution 1.)
>
>   This means that this DWARF tree:
>
>   : DW_TAG_pointer_type
>   :   DW_AT_type "int"
>   :   DW_TAG_LLVM_annotation
>   :     DW_AT_name        "btf_type_tag"
>   :     DW_AT_const_value "tag1"
>
>   Denotes an annotation that applies to the type =int*=, not the pointee type
>   =int=.
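
The only difference between the current clang semantic and Solution 2 is
which DIE a child annotation is attributed to.  Below is a minimal Python
sketch of both readings of the same tree, assuming a toy `Die` class.  All
names here are hypothetical, and pointers to =void= are left out for
brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Die:
    """Toy stand-in for a DWARF DIE with attributes and child DIEs."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def tags_of(die):
    """Values of the btf_type_tag annotations found among die's children."""
    return [c.attrs["DW_AT_const_value"] for c in die.children
            if c.tag == "DW_TAG_LLVM_annotation"
            and c.attrs.get("DW_AT_name") == "btf_type_tag"]


def describe(die):
    """Render a C-like name for a base type or pointer type DIE."""
    if die.tag == "DW_TAG_pointer_type":
        return describe(die.attrs["DW_AT_type"]) + "*"
    return die.attrs["DW_AT_name"]


def annotated_type(ptr, solution2):
    """Name of the type that ptr's child annotations apply to."""
    # Solution 2: the parent DIE itself.  Current clang: its pointee.
    target = ptr if solution2 else ptr.attrs["DW_AT_type"]
    return describe(target)


int_die = Die("DW_TAG_base_type", {"DW_AT_name": "int"})
ann = Die("DW_TAG_LLVM_annotation",
          {"DW_AT_name": "btf_type_tag", "DW_AT_const_value": "tag1"})
ptr = Die("DW_TAG_pointer_type", {"DW_AT_type": int_die}, [ann])

print(tags_of(ptr), "->", annotated_type(ptr, solution2=False))
print(tags_of(ptr), "->", annotated_type(ptr, solution2=True))
```
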
>
>   Advantages of this approach:
>
>   - This approach makes it possible to annotate any type, qualified or not,
>     pointed-to or not.
>
>   - This can easily be implemented as a compiler extension, because existing
>     DWARF consumers will happily ignore the new child DIEs in case they don't
>     support them;  the type chains in the tree remain the same.
>
>   - Easy to implement in GCC.
>
>   Disadvantages of this approach:
>
>   - This may result in an increased number of type nodes in the tree.  For
>     example, we may have a tagged =int*= and a non-tagged =int*=, which now
>     will have to be implemented using two different DIEs.
>    
>   - This is not backwards-compatible with what clang currently generates, in
>     the case of pointer types.
>
>   - It cannot handle annotated `void' as currently generated by GCC and
>     clang/LLVM, so for tagged =void= we would need to generate unspecified
>     types with name "void":
>
>     : DW_TAG_unspecified_type
>     :   DW_AT_name "void"
>     :   DW_TAG_LLVM_annotation
>     :     DW_AT_name        "btf_type_tag"
>     :     DW_AT_const_value "tag1"
>
>     But this should be supported by DWARF consumers, as per the DWARF spec,
>     and it is certainly recognized by GDB.
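
The first disadvantage above (extra type nodes) can be illustrated with a
toy type table that reuses a node only when kind, target and tags all
match.  This is a minimal Python sketch.  `intern_type` is a hypothetical
helper and does not describe how GCC or clang actually deduplicate DIEs.

```python
def intern_type(table, kind, target=None, tags=()):
    """Return the index of a type node, creating it only if none matches."""
    key = (kind, target, tuple(tags))
    if key not in table:
        table[key] = len(table)
    return table[key]


table = {}
int_id = intern_type(table, "int")
plain_ptr = intern_type(table, "pointer", int_id)
tagged_ptr = intern_type(table, "pointer", int_id, tags=["tag1"])
plain_again = intern_type(table, "pointer", int_id)

# int, plain int* and tagged int* are three distinct nodes;
# the second request for a plain int* reuses the existing one.
print(len(table), plain_ptr == plain_again, plain_ptr == tagged_ptr)
```
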
>
> * Solution 3a: annotations as set of attributes
>
>   Another possible solution is to extend DWARF with a pair of new
>   attributes =DW_AT_annotation_tag= and =DW_AT_annotation_value=.
>
>   Annotated types will have these attributes defined.  Example:
>
>   : DW_TAG_pointer_type
>   :   DW_AT_type "int"
>   :   DW_AT_annotation_tag   "btf_type_tag"
>   :   DW_AT_annotation_value "tag1"
>
>   Note that in this example the tag applies to the pointer type, not the
>   pointee, i.e. to =int*=.
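
Reading such a pair on the consumer side could look like the minimal Python
sketch below, assuming a DIE's attributes are available as a plain dict.
`read_annotation` is a hypothetical helper.  The two attribute names are
the ones proposed in these notes and do not exist in DWARF today.

```python
def read_annotation(attrs):
    """Return (tag, value) from a DIE's attribute dict, or None if absent."""
    has_tag = "DW_AT_annotation_tag" in attrs
    has_value = "DW_AT_annotation_value" in attrs
    if has_tag != has_value:
        # One attribute of the pair is missing: treat the DIE as malformed.
        raise ValueError("annotation tag and value must appear together")
    if not has_tag:
        return None
    return attrs["DW_AT_annotation_tag"], attrs["DW_AT_annotation_value"]


pointer_attrs = {
    "DW_AT_type": "int",
    "DW_AT_annotation_tag": "btf_type_tag",
    "DW_AT_annotation_value": "tag1",
}
print(read_annotation(pointer_attrs))
```
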
>
>   Advantages of this approach:
>
>   - This can easily be implemented as a compiler extension, because existing
>     DWARF consumers will happily ignore the new attributes in case they don't
>     support them;  the type chains in the tree remain the same.
>
>   - This is backwards compatible with what clang currently generates.
>
>   - Easy to implement in GCC.
>    
>   Disadvantages of this approach:
>
>   - This may result in an increased number of type nodes in the tree.  For
>     example, we may have a tagged =int*= and a non-tagged =int*=, which now
>     will have to be implemented using two different DIEs.
>
>   - It cannot handle annotated `void' as currently generated by GCC and
>     clang/LLVM, so for tagged =void= we would need to generate unspecified
>     types with name "void":
>
>     : DW_TAG_unspecified_type
>     :   DW_AT_name "void"
>     :   DW_AT_annotation_tag   "btf_type_tag"
>     :   DW_AT_annotation_value "tag1"
>
>     But this should be supported by DWARF consumers, as per the DWARF spec,
>     and it is certainly recognized by GDB.
>    
> * Solution 3b: annotations as single "structured" attributes
>
>   This is like 3a, but using a single attribute =DW_AT_annotation= instead of
>   two, and encoding the tag name and the tag value in the string value using
>   some convention.
>
>   For example:
>
>   : DW_TAG_pointer_type
>   :   DW_AT_type "int"
>   :   DW_AT_annotation "btf_type_tag tag1"
>
>   Meaning the tag name is "btf_type_tag" and the tag value is "tag1", using
>   the convention that a whitespace character separates them.
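
One possible reading of that convention is to split at the first run of
whitespace only.  The Python sketch below assumes exactly that.  The notes
leave the precise convention open, and `parse_annotation` is a hypothetical
helper.

```python
def parse_annotation(text):
    """Split 'NAME VALUE' at the first run of whitespace."""
    parts = text.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"malformed annotation string: {text!r}")
    name, value = parts
    return name, value


print(parse_annotation("btf_type_tag tag1"))
```

Splitting only once keeps any later whitespace inside the value, which is
one of the details such a convention would have to pin down.
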
>
>   Advantages over 3a:
>
>   - Using a single attribute is more robust, since it eliminates the possible
>     situation of a node having =DW_AT_annotation_tag= and not
>     =DW_AT_annotation_value=.
>
>   - It is easier to extend it, since the string stored in the
>     =DW_AT_annotation= attribute may be made as complex as desired.  Better
>     than adding more =DW_AT_annotation_FOO= attributes.
>
>   - This is backwards compatible with what clang currently generates.
>
>   - Easy to implement in GCC.
>    
>   Disadvantages over 3a:
>
>   - This requires defining conventions specifying the structure of the string
>     stored in the attribute.
>
>   - This has the danger of overzealous design: "let's store a JSON tree in
>     =DW_AT_annotation= for future extensions instead of continuing to bother
>     with DWARF".
>
>   - It cannot handle annotated `void' as currently generated by GCC and
>     clang/LLVM, so for tagged =void= we would need to generate unspecified
>     types with name "void":
>
>     : DW_TAG_unspecified_type
>     :   DW_AT_name "void"
>     :   DW_AT_annotation  "btf_type_tag tag1"
>
>     But this should be supported by DWARF consumers, as per the DWARF spec,
>     and it is certainly recognized by GDB.


