On 9/9/21 7:19 PM, Jose E. Marchesi wrote:
On 9/9/21 3:45 PM, Jose E. Marchesi wrote:
Hi Yonghong.
LLVM14 added support for a new C attribute ([1])
__attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3]).
The attribute is intended to provide additional
information for
- struct/union type or struct/union member
- static/global variables
- static/global function or function parameter.
This new attribute can be used to add attributes
to kernel codes, e.g., pre- or post- conditions,
allow/deny info, or any other info in which only
the kernel is interested. Such attributes will
be processed by clang frontend and emitted to
dwarf, converting to BTF by pahole. Ultimiately
the verifier can use these information for
verification purpose.
The new attribute can also be used for bpf
programs, e.g., tagging with __user attributes
for function parameters, specifying global
function preconditions, etc. Such information
may help verifier to detect user program
bugs.
After this series, pahole dwarf->btf converter
will be enhanced to support new llvm tag
for btf_tag attribute. With pahole support,
we will then try to add a few real use case,
e.g., __user/__rcu tagging, allow/deny list,
some kernel function precondition, etc,
in the kernel.
We are looking into implementing this in the GCC BPF port.
Hi, Jose, thanks for your reply. It would be great if the
btf_tag can be implemented in gcc.
Supporting the new C attribute in BPF programs as a target-specific
attribute, and the new BTF kind, is straightforward enough.
However, I am afraid it will be difficult to upstream to GCC support
for
a target-independent C attribute called `btf_tag' that emits a
LLVM-specific DWARF tag. Even if we proposed to use a GCC-specific
Are you concerned with the name? The btf_tag name cames from the
discussion in
https://lore.kernel.org/bpf/CAADnVQJa=b=hoMGU213wMxyZzycPEKjAPFArKNatbVe4FvzVUA@xxxxxxxxxxxxxx/
as llvm guys want this attribute to be explicitly referring to bpf echo
system because we didn't implement for C++, and we didn't try to
annotate everywhere. Since its main purpose is to eventually encode in
btf (for different architectures), so we settled with btf_tag instead of
bpf_tag.
But if you have suggestion to change to a different name which can
be acceptable by both gcc and llvm community, I am okay with that.
I think the name of the attribute is very fine when BTF is generated
directly, like when compiling BPF programs. My concern is that the
connection `btf_tag () -> DWARF -> kernel/pahole -> BTF' may be seen as
too indirect and application-specific (the kernel) for a general-purpose
compiler attribute.
For llvm, btf_tag implies implementation scope as it *only covers* btf
use cases. There are some other use cases which may utilize the same
IR/dwarf implementation, but they may use a flag to control or different
attribute. And this has been agreed upon with llvm community, so we
should be okay here.
DWARF tag like DW_TAG_GNU_annotation using the same number, or better a
compiler neutral tag like DW_TAG_annotation or DW_TAG_BPF_annotation,
adding such an attribute for all targets would still likely to be much
controversial...
This is okay too. If gcc settles with DW_TAG_GNU_annotation with a
different number (not conflict with existing other llvm tag numbers),
I think llvm can change to have the same name and number since we are
still in the release.
Thanks, that is very nice and appreciated :) I don't think the
particular number used to encode the tag matters much, provided it
doesn't collide with any existing one of course...
However, there may be a way to entirely avoid creating a new DWARF
tag... see below.
Would you be open to explore other, more generic, ways to convey
these
annotations to pahole, something that could be easily supported by GCC,
and potentially other C compilers?
Could you share your proposal in detail? I think some kind of difference
might be okay if it is handled by pahole and invisible to users,
although it would be good if pahole only needs to handle single
interface w.r.t. btf_tag support.
GCC can currently generate BTF for any target, not just BPF. For
example, you can compile an object foo.o with both DWARF and BTF with:
$ gcc -c -gdwarf -gbtf foo.c
or with just BTF with:
$ gcc -c -gbtf foo.c
Binutils (ld) also supports full type deduplication for CTF, which is
very similar to BTF. We use it to build kernels in-house with CTF
enabled for dtrace. It is certainly possible to add support to ld to
also merge and deduplicate BTF sections... it is similar enough to CTF
to (hopefully) not require much work, and we were already considering
doing it anyway for other purposes.
So the proposal would be:
For GCC, we can implement the btf_tag for any target, but impacting only
the BTF output as the name implies. No effect in DWARF. Once ld is
able to merge and deduplicate BTF, it shall then be possible to build
the kernel and obtain the BTF for it without the aid of pahole, just
building it with -gdwarf -gbtf and linking normally. (We know this works
with CTF.)
This should be okay.
For LLVM, nothing would have to be done in the short term: just use the
DWARF DIE + pahole approach already implemented. But in the medium term
LLVM could be made to 1) support emitting BTF for any target (not sure
how difficult would that be, maybe it already can do that?) and 2) to
support the -gbtf command-line option.
Then the generation of BTF for the kernel could be done in the same way
(same build rules) with both compilers, and there would be no need for
conveying the extra tags (nor any future extensions to aid the verifier
on the kernel side) to pahole via DWARF. Pure BTF all the way up (or
down) without diversion to DWARF :)
Does this make sense? WDYT?
During discussion to implement btf_tag attribute, I actually have a
prototype to emit BTF with non-BPF targets in llvm.
see https://reviews.llvm.org/D103549
But since we get a simpler solution to emit the info to llvm, so we went
there. We will keep this in mind, it is totally possible in the future
we may start to directly generate BTF from llvm for all architectures.