Hi Alan, On Thu, Feb 08, 2024 at 10:00:15AM +0000, Alan Maguire wrote: > On 04/02/2024 18:40, Daniel Xu wrote: [...] > > + > > + if (shdr.sh_type == SHT_SYMTAB) { > > + symbols_shndx = i; > > + symscn = scn; > > + symbols = data; > > + strtabidx = shdr.sh_link; > > + } else if (!strcmp(secname, BTF_IDS_SECTION)) { > > + idlist_shndx = i; > > + idlist_addr = shdr.sh_addr; > > + idlist = data; > > + } > > + } > > + > > Can we use the existing list of ELF functions collected via > btf_encoder__collect_symbols() for the above? We merge info across CUs > about ELF functions. It _seems_ like there might be a way to re-use this > info but I might be missing something; more below.. So the above two sections are only used to acquire information on the set8's. For collecting functions, this patch uses BTF (see btf_encoder__collect_btf_funcs()). > > > > + /* Cannot resolve symbol or .BTF_ids sections. Nothing to do. */ > > + if (symbols_shndx == -1 || idlist_shndx == -1) { > > + err = 0; > > + goto out; > > + } > > + > > + if (!gelf_getshdr(symscn, &shdr)) { > > + elf_error("Failed to get ELF symbol table header"); > > + goto out; > > + } > > + nr_syms = shdr.sh_size / shdr.sh_entsize; > > + > > + err = btf_encoder__collect_btf_funcs(encoder, &btf_funcs); > > + if (err) { > > + fprintf(stderr, "%s: failed to collect BTF funcs\n", __func__); > > + goto out; > > + } > > + > > + /* First collect all kfunc set ranges. > > + * > > + * Note we choose not to sort these ranges and accept a linear > > + * search when doing lookups. Reasoning is that the number of > > + * sets is ~O(100) and not worth the additional code to optimize. > > + */ > > + for (i = 0; i < nr_syms; i++) { > > + struct btf_kfunc_set_range range = {}; > > + const char *name; > > + GElf_Sym sym; > > + > > + if (!gelf_getsym(symbols, i, &sym)) { > > + elf_error("Failed to get ELF symbol(%d)", i); > > + goto out; > > + } > > + > > + if (sym.st_shndx != idlist_shndx) > > + continue; > > + > > + name = elf_strptr(elf, strtabidx, sym.st_name); > > + if (!is_sym_kfunc_set(&sym, name, idlist, idlist_addr)) > > + continue; > > + > > + range.start = sym.st_value; > > + range.end = sym.st_value + sym.st_size; > > we could potentially record this info when we collect symbols in > btf_encoder__collect_function(). The reason I suggest this is that it is > likely that to fully clarify which symbol a name refers to we will end > up needing the address. So struct elf_function could record start and > size, and that could be used by you later without having to parse ELF > for symbols (you'd still need to for the BTF ids section). This start/end pair is just for the set8's in .BTF_ids section. Which btf_encoder__collect_function() rightfully does not look at. So I think new code needs to be added for set8 collection anyways. I don't have any strong feelings about whether we should hook into btf_encoder__collect_symbols() or not -- IMHO it's a bit cleaner to have all the set8 stuff on its own codepath. What cases do you see needing the location + size for? Since kfuncs are exported, I would think it's not possible for a single object file to have multiple copies of the same kfunc. Nor that the compiler inline the kfunc. > > Then all you'd need to do is iterate over BTF functions, using > btf_encoder__find_function() to get a function and associated ELF info > by name. Didn't know about this, thanks. I'll take a look at if the patch can use the existing function metadata. That should get rid of btf_encoder__collect_btf_funcs() if it works. > > > > + gobuffer__add(&btf_kfunc_ranges, &range, sizeof(range)); > > + } > > + > > + /* Now inject BTF with kfunc decl tag for detected kfuncs */ > > + for (i = 0; i < nr_syms; i++) { > > + const struct btf_kfunc_set_range *ranges; > > + unsigned int ranges_cnt; > > + char *func, *name; > > + GElf_Sym sym; > > + bool found; > > + int err; > > + int j; > > + > > + if (!gelf_getsym(symbols, i, &sym)) { > > + elf_error("Failed to get ELF symbol(%d)", i); > > + goto out; > > + } > > + > > + if (sym.st_shndx != idlist_shndx) > > + continue; > > + > > + name = elf_strptr(elf, strtabidx, sym.st_name); > > + func = get_func_name(name); > > + if (!func) > > + continue; > > + > > + /* Check if function belongs to a kfunc set */ > > + ranges = gobuffer__entries(&btf_kfunc_ranges); > > + ranges_cnt = gobuffer__nr_entries(&btf_kfunc_ranges); > > + found = false; > > + for (j = 0; j < ranges_cnt; j++) { > > + size_t addr = sym.st_value; > > + > > + if (ranges[j].start <= addr && addr < ranges[j].end) { > > + found = true; > > + break; > > + } > > + } > > + if (!found) { > > + free(func); > > + continue; > > + } > > + > > + err = btf_encoder__tag_kfunc(encoder, &btf_funcs, func); > > + if (err) { > > + fprintf(stderr, "%s: failed to tag kfunc '%s'\n", __func__, func); > > + free(func); > > + goto out; > > + } > > + free(func); > > + } > > + > > + err = 0; > > +out: > > + __gobuffer__delete(&btf_funcs); > > + __gobuffer__delete(&btf_kfunc_ranges); > > + if (elf) > > + elf_end(elf); > > + if (fd != -1) > > + close(fd); > > + return err; > > +} > > + > > int btf_encoder__encode(struct btf_encoder *encoder) > > { > > int err; > > @@ -1367,6 +1718,14 @@ int btf_encoder__encode(struct btf_encoder *encoder) > > if (btf__type_cnt(encoder->btf) == 1) > > return 0; > > > > + /* Note vmlinux may already contain btf_decl_tag's for kfuncs. So > > + * take care to call this before btf_dedup(). > > + */ > > sorry another thing I should have thought of here; if the user has asked > to skip encoding declaration tags, we should respect that for this case > also. So you'll probably need to set > > encoder->skip_encoding_btf_decl_tag = > conf_load->skip_encoding_btf_decl_tag; > > ..earlier on, and bail here if encoder->skip_encoding_btf_decl_tag is > true. We'd need to be consistent in that if the user asks not to encode > declaration tags we don't do it for this case either. Yeah, good catch. Will do. [...] Thanks, Daniel