Re: disambiguate position-independent code and position-independent executable

On Thu, 2021-04-22 at 11:06 -0500, Peng Yu wrote:
> On 4/22/21, Xi Ruoyao <xry111@xxxxxxxxxxxxxxxx> wrote:
> > [snip]
> > 
> > I have some code showing the semantic interposition breakage by
> > misusing
> > -fPIE for a lib:
> > 
> > https://linux.xidian.edu.cn/git/xry111/pie_vs_pic
> 
> ### program output
> running the program linked to the correct shared object:
> ./exe_link_to_pic
> foo in exe
> foo in exe
> 
> running the program linked to the buggy (-fPIE) shared object:
> ./exe_link_to_pie
> foo in exe
> foo in lib
> ###
> 
> According to my understanding, it is always the foo in lib.c that is
> called by bar in the same file, because that is what people assume
> when they write lib.c, no matter how the library is linked.
> 
> So I'd think the -fPIE result is correct. Why do you think the -fPIC
> result is correct?

TL;DR:  System V-style shared libraries use global not lexical scoping.

Because that’s how the System V model of shared libraries (but not e.g.
the Windows model of shared libraries) is _defined to work_[1]:  the
loader chucks all symbol definitions from the executable and the shared
libraries into a single large bag (the global namespace) as it loads
them, then resolves all relocations in all libraries by looking into
that bag;  a definition of a symbol closer to the root of the
dependency tree (the executable) always overrides a definition
farther from it, regardless of where it’s referenced.  Also, when a
shared library is compiled and linked, the default behaviour is to
leave references to externally visible symbols, even ones the library
itself defines, as relocations to be resolved at load time.
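
To make the "single large bag" concrete, here is a toy model of the
lookup in C.  It is only a sketch of the rule described above, not how
any real loader is implemented;  the struct, the function name and the
"exe"/"lib" objects are made up for illustration, with the symbol lists
mirroring the foo/bar demo quoted earlier.

```c
#include <stddef.h>
#include <string.h>

/* One loaded object (the executable or a shared library) together with
   the symbols it defines.  Purely illustrative. */
struct object {
    const char *name;
    const char *const *defs;   /* NULL-terminated list of symbol names */
};

/* Walk the objects in search order and return the name of the first one
   that defines `symbol`, or NULL if none does.  Note that the lookup
   never asks which object the reference came from. */
const char *resolve(const struct object *scope, size_t n, const char *symbol)
{
    for (size_t i = 0; i < n; i++)
        for (const char *const *d = scope[i].defs; *d != NULL; d++)
            if (strcmp(*d, symbol) == 0)
                return scope[i].name;
    return NULL;
}

static const char *const exe_defs[] = { "main", "foo", NULL };
static const char *const lib_defs[] = { "foo", "bar", NULL };

/* Search order: the executable first, then its dependencies. */
static const struct object scope[] = {
    { "exe", exe_defs },
    { "lib", lib_defs },
};
```

With this scope, resolve(scope, 2, "foo") yields "exe" even when the
reference to foo sits inside lib, while resolve(scope, 2, "bar") yields
"lib" because nothing earlier in the order defines bar.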

Those two things in conjunction mean that, indeed, when a function
defined inside a shared library calls another globally visible function
(apparently) defined inside that shared library, what this function
ends up actually calling depends on how that library ends up being used
and _cannot_ be inferred from the code of the library alone.  This may
seem perverse, but that’s why you can globally replace all calls to
malloc() [dmalloc] or connect() [socksify] throughout a dynamically
linked program by setting LD_PRELOAD:  these calls are not resolved
yet, neither in the executable nor in the shared libraries, not even
inside libc.  This possibility of overriding even intra-library calls
goes by the fancy name of “semantic interposition” mentioned above.
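
As a small illustration of the LD_PRELOAD trick, here is a sketch of a
preloadable replacement for rand().  The file name rand42.c, the
library name and the constant are my own choices, not taken from
dmalloc or socksify, which are considerably more involved.

```c
/* rand42.c: hypothetical interposer, for illustration only */
#include <stdlib.h>

/* Same name and signature as the libc function.  Whichever definition
   comes first in the lookup order wins, and a preloaded library (or the
   executable itself) comes before libc. */
int rand(void)
{
    return 42;
}
```

Assuming a GCC-like toolchain on Linux, something along the lines of
"cc -shared -fPIC -o librand42.so rand42.c" followed by
"LD_PRELOAD=./librand42.so ./prog" should make a dynamically linked
prog see this rand() instead of the one in libc.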

I _think_ this was done so that shared libraries sort of work like
static libraries (where the main executable is indeed able to override
a function inside a static library in some cases), even if in other
potentially useful ways they don’t (no weak symbols).  But this is
speculation, I don’t have a reference.

You cannot make the loader behave differently, but you _can_ make the
compiler and linker resolve library-internal references at build time
and not leave them until runtime, using things like -Wl,-Bsymbolic,
-fno-semantic-interposition and -fvisibility= (they’re all different,
read the docs).  But this means that you will be breaking people’s
expectations of how shared libraries work on their platform, and that
should always be done carefully, even if you think the current design
is stupid and wrong.  (I do.)
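
For a narrower opt-out than the global switches, one common pattern is
to route internal calls through a symbol with hidden visibility.  The
following is a sketch under my own assumptions:  the lib.c layout and
the foo_impl name are hypothetical and not taken from the pie_vs_pic
repository, and the attribute is the GCC/Clang spelling.

```c
/* lib.c: hypothetical library source, for illustration only */

/* Internal implementation.  Hidden visibility keeps it out of the
   library's dynamic symbol table, so nothing outside the library can
   interpose it. */
__attribute__((visibility("hidden")))
const char *foo_impl(void)
{
    return "foo in lib";
}

/* Public entry point: still exported, so the executable can still
   override it for its own callers. */
const char *foo(void)
{
    return foo_impl();
}

/* bar() calls the hidden implementation directly, so it gets the
   library's own code regardless of who else defines foo. */
const char *bar(void)
{
    return foo_impl();
}
```

This keeps foo interposable for everyone else while making the
library's own behaviour independent of how it is linked, at the cost
of no longer honouring an override of foo inside bar.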

[1]: https://refspecs.linuxbase.org/elf/elf.pdf

-- 
Cheers,
Alex
