On Fri, Sep 15, 2023 at 12:30 AM Hengqi Chen <hengqi.chen@xxxxxxxxx> wrote: > > On Fri, Sep 15, 2023 at 1:12 AM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Thu, Sep 14, 2023 at 5:37 AM Hengqi Chen <hengqi.chen@xxxxxxxxx> wrote: > > > > > > On Wed, Sep 13, 2023 at 7:14 AM Andrii Nakryiko > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > On Sun, Sep 10, 2023 at 6:51 PM Hengqi Chen <hengqi.chen@xxxxxxxxx> wrote: > > > > > > > > > > In current implementation, we assume that symbol found in .dynsym section > > > > > would have a version suffix and use it to compare with symbol user supplied. > > > > > According to the spec ([0]), this assumption is incorrect, the version info > > > > > of dynamic symbols are stored in .gnu.version and .gnu.version_d sections > > > > > of ELF objects. For example: > > > > > > > > > > $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock > > > > > 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 > > > > > 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 > > > > > 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 > > > > > > > > > > $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock > > > > > 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 > > > > > 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 > > > > > 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 > > > > > > > > > > In this case, specify pthread_rwlock_wrlock@@GLIBC_2.34 or > > > > > pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work. > > > > > Because the qualified name does NOT match `pthread_rwlock_wrlock` (without > > > > > version suffix) in .dynsym sections. > > > > > > > > > > This commit implements the symbol versioning for dynsym and allows user to > > > > > specify symbol in the following forms: > > > > > - func > > > > > - func@LIB_VERSION > > > > > - func@@LIB_VERSION > > > > > > > > > > In case of symbol conflicts, error out and users should resolve it by > > > > > specifying a qualified name. > > > > > > > > > > [0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html > > > > > > > > > > Reviewed-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > > > > > Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx> > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@xxxxxxxxx> > > > > > --- > > > > > tools/lib/bpf/elf.c | 146 +++++++++++++++++++++++++++++++++++++---- > > > > > tools/lib/bpf/libbpf.c | 2 +- > > > > > 2 files changed, 134 insertions(+), 14 deletions(-) > > > > > > > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > > > > > index 5c9e588b17da..825da903a34c 100644 > > > > > --- a/tools/lib/bpf/elf.c > > > > > +++ b/tools/lib/bpf/elf.c > > > > > @@ -1,5 +1,8 @@ > > > > > // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) > > > > > > > > > > +#ifndef _GNU_SOURCE > > > > > +#define _GNU_SOURCE > > > > > +#endif > > > > > #include <libelf.h> > > > > > #include <gelf.h> > > > > > #include <fcntl.h> > > > > > @@ -10,6 +13,17 @@ > > > > > > > > > > #define STRERR_BUFSIZE 128 > > > > > > > > > > +/* A SHT_GNU_versym section holds 16-bit words. This bit is set if > > > > > + * the symbol is hidden and can only be seen when referenced using an > > > > > + * explicit version number. This is a GNU extension. > > > > > + */ > > > > > +#define VERSYM_HIDDEN 0x8000 > > > > > + > > > > > +/* This is the mask for the rest of the data in a word read from a > > > > > + * SHT_GNU_versym section. > > > > > + */ > > > > > +#define VERSYM_VERSION 0x7fff > > > > > + > > > > > int elf_open(const char *binary_path, struct elf_fd *elf_fd) > > > > > { > > > > > char errmsg[STRERR_BUFSIZE]; > > > > > @@ -64,11 +78,14 @@ struct elf_sym { > > > > > const char *name; > > > > > GElf_Sym sym; > > > > > GElf_Shdr sh; > > > > > + int ver; > > > > > + bool hidden; > > > > > }; > > > > > > > > > > struct elf_sym_iter { > > > > > Elf *elf; > > > > > Elf_Data *syms; > > > > > + Elf_Data *versyms; > > > > > size_t nr_syms; > > > > > size_t strtabidx; > > > > > size_t next_sym_idx; > > > > > @@ -111,6 +128,18 @@ static int elf_sym_iter_new(struct elf_sym_iter *iter, > > > > > iter->nr_syms = iter->syms->d_size / sh.sh_entsize; > > > > > iter->elf = elf; > > > > > iter->st_type = st_type; > > > > > + > > > > > + /* Version symbol table is meaningful to dynsym only */ > > > > > + if (sh_type != SHT_DYNSYM) > > > > > + return 0; > > > > > + > > > > > + scn = elf_find_next_scn_by_type(elf, SHT_GNU_versym, NULL); > > > > > + if (!scn) > > > > > + return 0; > > > > > + if (!gelf_getshdr(scn, &sh)) > > > > > + return -EINVAL; > > > > > + iter->versyms = elf_getdata(scn, 0); > > > > > + > > > > > return 0; > > > > > } > > > > > > > > > > @@ -119,6 +148,7 @@ static struct elf_sym *elf_sym_iter_next(struct elf_sym_iter *iter) > > > > > struct elf_sym *ret = &iter->sym; > > > > > GElf_Sym *sym = &ret->sym; > > > > > const char *name = NULL; > > > > > + GElf_Versym versym; > > > > > Elf_Scn *sym_scn; > > > > > size_t idx; > > > > > > > > > > @@ -138,12 +168,113 @@ static struct elf_sym *elf_sym_iter_next(struct elf_sym_iter *iter) > > > > > > > > > > iter->next_sym_idx = idx + 1; > > > > > ret->name = name; > > > > > + ret->ver = 0; > > > > > + ret->hidden = false; > > > > > + > > > > > + if (iter->versyms) { > > > > > + if (!gelf_getversym(iter->versyms, idx, &versym)) > > > > > + continue; > > > > > + ret->ver = versym & VERSYM_VERSION; > > > > > + ret->hidden = versym & VERSYM_HIDDEN; > > > > > + } > > > > > return ret; > > > > > } > > > > > > > > > > return NULL; > > > > > } > > > > > > > > > > +static const char *elf_get_vername(Elf *elf, int ver) > > > > > +{ > > > > > + GElf_Verdaux verdaux; > > > > > + GElf_Verdef verdef; > > > > > + Elf_Data *verdefs; > > > > > + size_t strtabidx; > > > > > + GElf_Shdr sh; > > > > > + Elf_Scn *scn; > > > > > + int offset; > > > > > + > > > > > + scn = elf_find_next_scn_by_type(elf, SHT_GNU_verdef, NULL); > > > > > > > > so this is a linear search, right? And we'll be doing it for every > > > > .dynsym symbol. Let's do this once at the creation time and remember a > > > > pointer inside struct Elf? > > > > > > > > > > We reach here only when the symbol part match, and likely we get the > > > desired one. > > > > sure, but you can have multiple versions, so multiple hits of this. > > And the ELF itself could be pretty big with lots of sections. So I > > think we should try to minimize number of linear searches over ELF > > sections, if possible. > > > > > if we store the pointers in struct Elf, then we have to involve > > > dynamic allocations. > > > > I'm not sure why dynamic allocations are needed? > > > > Your last comment says remember those version name pointer in struct Elf, > I thought this would be allocate an array and save those pointers > (i.e. store a mapping from version index to version name) > > > But also I had struct elf_sym_iter in mind, where we already cache > > Elf_Data *versyms, so why not do that with verdefs as well? I think > > verdef is not per symbol. I meant to just not do elf_find_next_scn_by_type() search every time, and just store Elf_Scn * pointer for SHT_GNU_verdef in iter state. > > > all these helpers (symbol_match and elf_get_vername) are called only > > from elf_sym_iter stuff right now, right? > > > > > > > > > > > > + if (!scn) > > > > > + return NULL; > > > > > + if (!gelf_getshdr(scn, &sh)) > > > > > + return NULL; > > > > > + strtabidx = sh.sh_link; > > > > > + verdefs = elf_getdata(scn, 0); > > > > > + > > > > > + offset = 0; > > > > > + while (gelf_getverdef(verdefs, offset, &verdef)) { > > > > > + if (verdef.vd_ndx != ver) { > > > > > + if (!verdef.vd_next) > > > > > + break; > > > > > + > > > > > + offset += verdef.vd_next; > > > > > + continue; > > > > > + } > > > > > + > > > > > + if (!gelf_getverdaux(verdefs, offset + verdef.vd_aux, &verdaux)) > > > > > + break; > > > > > + > > > > > + return elf_strptr(elf, strtabidx, verdaux.vda_name); > > > > > + > > > > > + } > > > > > + return NULL; > > > > > +} [...]