Re: [PATCHv2 bpf-next 06/24] libbpf: Add elf symbol iterator

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 22, 2023 at 05:31:58PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 20, 2023 at 1:36 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> >
> > Adding elf symbol iterator object (and some functions) that follow
> > open-coded iterator pattern and some functions to ease up iterating
> > elf object symbols.
> >
> > The idea is to iterate single symbol section with:
> >
> >   struct elf_symbol_iter iter;
> >   struct elf_symbol *sym;
> >
> >   if (elf_symbol_iter_new(&iter, elf, binary_path, SHT_DYNSYM))
> >         goto error;
> >
> >   while ((sym = elf_symbol_iter_next(&iter))) {
> >         ...
> >   }
> >
> > I considered opening the elf inside the iterator and iterate all symbol
> > sections, but then it gets more complicated wrt user checks for when
> > the next section is processed.
> >
> > Plus side is the we don't need 'exit' function, because caller/user is
> > in charge of that.
> >
> > The returned iterated symbol object from elf_symbol_iter_next function
> > is placed inside the struct elf_symbol_iter, so no extra allocation or
> > argument is needed.
> >
> > Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > ---
> >  tools/lib/bpf/libbpf.c | 179 ++++++++++++++++++++++++++---------------
> >  1 file changed, 114 insertions(+), 65 deletions(-)
> >
> 
> This is great. Left a few nits below. I'm thinkin maybe we should add
> a separate elf.c file for all these ELF-related helpers and start
> offloading code from libbpf.c, which got pretty big already. WDYT?

yes, I thought doing the move after this is merged might be better,
because it's quite big already

> 
> 
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index af52188daa80..cdac368c7ce1 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -10824,6 +10824,109 @@ static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)
> >         return NULL;
> >  }
> >
> > +struct elf_symbol {
> > +       const char *name;
> > +       unsigned long offset;
> > +       int bind;
> > +};
> > +
> > +struct elf_symbol_iter {
> 
> naming nits: elf_sym and elf_sym_iter? keep it short, keep it cool :)

ok

> 
> > +       Elf *elf;
> > +       Elf_Data *symbols;
> 
> syms :-P

ook ;-)

> 
> > +       size_t nr_syms;
> > +       size_t strtabidx;
> > +       size_t idx;
> 
> next_sym_idx?

ok

> 
> > +       struct elf_symbol sym;
> > +};
> > +
> > +static int elf_symbol_iter_new(struct elf_symbol_iter *iter,
> > +                              Elf *elf, const char *binary_path,
> > +                              int sh_type)
> > +{
> > +       Elf_Scn *scn = NULL;
> > +       GElf_Ehdr ehdr;
> > +       GElf_Shdr sh;
> > +
> > +       memset(iter, 0, sizeof(*iter));
> > +
> > +       if (!gelf_getehdr(elf, &ehdr)) {
> > +               pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
> > +               return -LIBBPF_ERRNO__FORMAT;
> > +       }
> > +
> > +       scn = elf_find_next_scn_by_type(elf, sh_type, NULL);
> > +       if (!scn) {
> > +               pr_debug("elf: failed to find symbol table ELF sections in '%s'\n",
> > +                        binary_path);
> > +               return -EINVAL;
> > +       }
> > +
> > +       if (!gelf_getshdr(scn, &sh))
> > +               return -EINVAL;
> > +
> > +       iter->strtabidx = sh.sh_link;
> > +       iter->symbols = elf_getdata(scn, 0);
> > +       if (!iter->symbols) {
> > +               pr_warn("elf: failed to get symbols for symtab section in '%s': %s\n",
> > +                       binary_path, elf_errmsg(-1));
> > +               return -LIBBPF_ERRNO__FORMAT;
> > +       }
> > +       iter->nr_syms = iter->symbols->d_size / sh.sh_entsize;
> > +       iter->elf = elf;
> > +       return 0;
> > +}
> > +
> > +static struct elf_symbol *elf_symbol_iter_next(struct elf_symbol_iter *iter)
> > +{
> > +       struct elf_symbol *ret = &iter->sym;
> > +       unsigned long offset = 0;
> > +       const char *name = NULL;
> > +       GElf_Shdr sym_sh;
> > +       Elf_Scn *sym_scn;
> > +       GElf_Sym sym;
> > +       size_t idx;
> > +
> > +       for (idx = iter->idx; idx < iter->nr_syms; idx++) {
> > +               if (!gelf_getsym(iter->symbols, idx, &sym))
> > +                       continue;
> > +               if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
> > +                       continue;
> 
> it would be more generic if this symbol type filter was a parameter to
> iterator, instead of hard-coding it?

ok

> 
> > +               name = elf_strptr(iter->elf, iter->strtabidx, sym.st_name);
> > +               if (!name)
> > +                       continue;
> > +
> > +               /* Transform symbol's virtual address (absolute for
> > +                * binaries and relative for shared libs) into file
> > +                * offset, which is what kernel is expecting for
> > +                * uprobe/uretprobe attachment.
> > +                * See Documentation/trace/uprobetracer.rst for more
> > +                * details.
> > +                * This is done by looking up symbol's containing
> > +                * section's header and using iter's virtual address
> > +                * (sh_addr) and corresponding file offset (sh_offset)
> > +                * to transform sym.st_value (virtual address) into
> > +                * desired final file offset.
> > +                */
> > +               sym_scn = elf_getscn(iter->elf, sym.st_shndx);
> > +               if (!sym_scn)
> > +                       continue;
> > +               if (!gelf_getshdr(sym_scn, &sym_sh))
> > +                       continue;
> > +
> > +               offset = sym.st_value - sym_sh.sh_addr + sym_sh.sh_offset;
> 
> I think this part is not really generic "let's iterate ELF symbols",
> maybe let users of iterator do this? We can have a helper to do
> translation if we need to do it in few different places.

yes this will be called in all the places we use the iterator,
I'll add the helper for it

> 
> > +               break;
> > +       }
> > +
> > +       /* we reached the last symbol */
> > +       if (idx == iter->nr_syms)
> > +               return NULL;
> > +       iter->idx = idx + 1;
> > +       ret->name = name;
> > +       ret->bind = GELF_ST_BIND(sym.st_info);
> > +       ret->offset = offset;
> 
> Why not just return entire GElf_Sym information and let user process
> it as desired. So basically for each symbol you'll give back its name,
> GElf_Sym info, and I'd return symbol index as well. That will keep
> this very generic for future uses.

ok, so you have other users of this iterator in mind already?

> 
> > +       return ret;
> 
> I'd structure this a bit different. If we got out of loop, just return
> NULL. Then inside the for loop, when we found the symbol, fill out ret
> and return from inside the for loop. I think it's more
> straightforward.

ok, will change

thanks,
jirka

> 
> > +}
> > +
> >  /* Find offset of function name in the provided ELF object. "binary_path" is
> >   * the path to the ELF binary represented by "elf", and only used for error
> >   * reporting matters. "name" matches symbol name or name@@LIB for library
> 
> [...]




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux