Hi Dave,

On Thu, Dec 21, 2023 at 05:57:18PM -0500, David Marchevsky wrote:
>
>
> On 12/20/23 5:19 PM, Daniel Xu wrote:
> > This commit teaches pahole to parse symbols in .BTF_ids section in
> > vmlinux and discover exported kfuncs. Pahole then takes the list of
> > kfuncs and injects a BTF_KIND_DECL_TAG for each kfunc.
> >
> > This enables downstream users and tools to dynamically discover which
> > kfuncs are available on a system by parsing vmlinux or module BTF, both
> > available in /sys/kernel/btf.
> >
> > Example of encoding:
> >
> >         $ bpftool btf dump file .tmp_vmlinux.btf | rg DECL_TAG | wc -l
> >         388
> >
> >         $ bpftool btf dump file .tmp_vmlinux.btf | rg 68940
> >         [68940] FUNC 'bpf_xdp_get_xfrm_state' type_id=68939 linkage=static
> >         [128124] DECL_TAG 'kfunc' type_id=68940 component_idx=-1
> >
> > Signed-off-by: Daniel Xu <dxu@xxxxxxxxx>
> > ---
> >  btf_encoder.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 202 insertions(+)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index fd04008..2697214 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -34,6 +34,9 @@
> >  #include <pthread.h>
> >
> >  #define BTF_ENCODER_MAX_PROTO 512
> > +#define BTF_IDS_SECTION ".BTF_ids"
> > +#define BTF_ID_FUNC_PFX "__BTF_ID__func__"
> > +#define BTF_KFUNC_TYPE_TAG "kfunc"
>
> Can this be bpf_kfunc? Elaborated on elsewhere in this reply

Yeah, that's better. Good idea.

>
> >
> >  /* state used to do later encoding of saved functions */
> >  struct btf_encoder_state {
> > @@ -1352,6 +1355,200 @@ out:
> >  	return err;
> >  }
> >
> > +/*
> > + * Parse BTF_ID symbol and return the kfunc name.
> > + *
> > + * Returns:
> > + *	Callee-owned string containing kfunc name if successful.
>
> nit: Caller-owned, not callee-owned

Fixed, thanks.

>
> > + *	NULL if !kfunc or on error.
> > + */
> > +static char *get_kfunc_name(const char *sym)
> > +{
> > +	char *kfunc, *end;
> > +
> > +	if (strncmp(sym, BTF_ID_FUNC_PFX, sizeof(BTF_ID_FUNC_PFX) - 1))
> > +		return NULL;
> > +
> > +	/* Strip prefix */
> > +	kfunc = strdup(sym + sizeof(BTF_ID_FUNC_PFX) - 1);
> > +
> > +	/* Strip suffix */
> > +	end = strrchr(kfunc, '_');
> > +	if (!end || *(end - 1) != '_') {
> > +		free(kfunc);
> > +		return NULL;
> > +	}
> > +	*(end - 1) = '\0';
> > +
> > +	return kfunc;
> > +}
> > +
> > +static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, const char *kfunc)
> > +{
> > +	int nr_types, type_id, err = -1;
> > +	struct btf *btf = encoder->btf;
> > +
> > +	nr_types = btf__type_cnt(btf);
> > +	for (type_id = 1; type_id < nr_types; type_id++) {
> > +		const struct btf_type *type;
> > +		const char *name;
> > +
> > +		type = btf__type_by_id(btf, type_id);
> > +		if (!type) {
> > +			fprintf(stderr, "%s: malformed BTF, can't resolve type for ID %d\n",
> > +				__func__, type_id);
> > +			goto out;
> > +		}
> > +
> > +		if (!btf_is_func(type))
> > +			continue;
> > +
> > +		name = btf__name_by_offset(btf, type->name_off);
> > +		if (!name) {
> > +			fprintf(stderr, "%s: malformed BTF, can't resolve name for ID %d\n",
> > +				__func__, type_id);
> > +			goto out;
> > +		}
> > +
> > +		if (strcmp(name, kfunc))
> > +			continue;
> > +
> > +		err = btf__add_decl_tag(btf, BTF_KFUNC_TYPE_TAG, type_id, -1);
>
> In an ideal world we'd just add this tag to __bpf_kfunc macro
> definition, right? Then bpftool can generate fwd decls from generated
> vmlinux w/o any pahole changes. But no gcc support for BTF tags, so need
> to do this workaround.
>
> With that in mind, instead of unconditionally adding BTF_KFUNC_TYPE_TAG
> to funcs in btf id sets, can this code only do so if there isn't an
> existing BTF_KFUNC_TYPE_TAG pointing to it? It'd require another loop
> over btf types to built set of already-tagged funcs, but would
> future-proof this work.
> Alternatively, if existing btf__dedup call after
> btf_encoder__tag_kfuncs will get rid of these extraneous "tagged types"
> in the scenario where one already exists, then a comment here to that
> effect would be appreciated.

Yeah, I placed the call to btf_encoder__tag_kfuncs() right before the
call to btf__dedup() in btf_encoder__encode() cuz I was noticing
duplicates. After moving the call to the current location, I noticed
the duplicates went away. I'll leave a comment to that effect.

Thanks,
Daniel
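
For reference, the prefix/suffix stripping that get_kfunc_name() does in
the quoted hunk can be sketched standalone as below. This is only an
illustration, not the code in the patch: the helper name
kfunc_name_from_sym() is made up, it assumes symbols look like
__BTF_ID__func__<name>__<suffix> (which is what the quoted code strips),
and it adds an allocation check and a bounds check before looking at
end[-1].

```c
#include <stdlib.h>
#include <string.h>

#define BTF_ID_FUNC_PFX "__BTF_ID__func__"

/*
 * Hypothetical helper (not part of pahole): extract <name> from a symbol
 * of the assumed form __BTF_ID__func__<name>__<suffix>.
 *
 * Returns a caller-owned string (free() it), or NULL if the symbol does
 * not match the expected form or allocation fails.
 */
static char *kfunc_name_from_sym(const char *sym)
{
	const size_t pfx_len = sizeof(BTF_ID_FUNC_PFX) - 1;
	size_t len;
	char *name, *end;

	if (strncmp(sym, BTF_ID_FUNC_PFX, pfx_len) != 0)
		return NULL;

	/* Copy everything after the prefix */
	len = strlen(sym + pfx_len);
	name = malloc(len + 1);
	if (!name)
		return NULL;
	memcpy(name, sym + pfx_len, len + 1);

	/*
	 * Strip the trailing "__<suffix>". Require at least one name
	 * character before the "__" so end[-1] never reads before the
	 * start of the buffer.
	 */
	end = strrchr(name, '_');
	if (!end || end - name < 2 || end[-1] != '_') {
		free(name);
		return NULL;
	}
	end[-1] = '\0';

	return name;
}
```

With a made-up suffix, "__BTF_ID__func__bpf_xdp_get_xfrm_state__12345"
would come back as "bpf_xdp_get_xfrm_state", and a symbol with a
different prefix (e.g. a struct BTF_ID) would come back as NULL.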