On Sun, May 19, 2019 at 10:42:25PM +0300, Alexander Monakov wrote:
> On Sun, 19 May 2019, Arvind Sankar wrote:
>
> > Hi, I am trying to understand the code generated when accessing global
> > variables on PIE code and the reason for some differences between 32-bit
> > and 64-bit cases.
> >
> [snip]
> >
> > Why can the -m32 -fcommon case not use the same code as the -fno-common
> > case in order to avoid indirecting through the GOT, while for 64-bit we can
> > achieve that regardless of the -fcommon setting? For 64-bit, indirection
> > through GOT appears to be done only with -fPIC.
>
> I don't have first-hand info, but note that GCC essentially treats common
> definitions similarly to extern declarations in these circumstances (i.e.
> -fpie -m64 uses PC-relative access, -fpie -m32 uses GOT indirection).
> Testing on godbolt.org shows that LLVM changed its behavior there between
> 3.8 and 3.9.
>
> If at link time it's decided that the prevailing definition of a symbol
> comes from a shared library (it's possible for common symbols to be
> preempted by strong symbols from shared libraries, ELF spec leaves this
> choice up to the implementation), but some code in the executable uses
> a PC-relative access, a COPY relocation will be needed so that the
> definition is copied by the dynamic linker at program startup. So maybe
> the idea is to avoid creating copy relocations?..

Maybe -- I tested, and the 64-bit version generates COPY relocs while the
32-bit version doesn't. It does seem strange that the two archs are treated
differently.

>
> Also note how telling the compiler that the common symbol cannot be
> preempted with __attribute__((visibility("hidden"))) avoids the
> indirection.
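
For anyone following along, here is a minimal sketch of the kind of test
case under discussion. The original example was snipped, so the names
counter, hidden_counter, bump, bump_hidden and self_check are made up for
illustration. The idea is to compile it with something like
"gcc -O2 -fpie -fcommon -S", once with -m32 and once with -m64, and compare
how the two accessors reach their variable. I am only assuming the output
will match what is described above; it may depend on the compiler version.

```c
/* Hypothetical test case; the names are illustrative and do not come
 * from the original (snipped) example. */
#include <assert.h>

/* Tentative definition: emitted as a common symbol under -fcommon. */
int counter;

/* Same kind of definition, but hidden, so a shared library cannot
 * preempt it. */
__attribute__((visibility("hidden"))) int hidden_counter;

int bump(void)        { return ++counter; }
int bump_hidden(void) { return ++hidden_counter; }

/* Both accessors behave identically at the C level; any difference is
 * only in the relocations the compiler chooses to emit. */
void self_check(void)
{
    int a = bump();
    int b = bump_hidden();
    assert(bump() == a + 1);
    assert(bump_hidden() == b + 1);
}
```

There is deliberately no main() here, since the interesting part is the
assembly for bump() versus bump_hidden(), not the program's behaviour.
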
>
> (filing a bug might be a more sure way to get a definitive answer :)
>
> >
> > PS: Separate comment on 32-bit code: if the function only uses
> > R_386_GOTOFF relocations, is it not possible to replace them with
> > R_386_PC32, adjusting the addend for the number of instruction bytes
> > between the prologue code where we loaded the PC and the instruction
> > that accesses the variable? This would eliminate the instruction to add
> > $_GLOBAL_OFFSET_TABLE_. With the addition of an R_386_GOTPCREL
> > relocation similar to the 64-bit case, the R_386_GOT32 relocations could
> > also be handled that way.
>
> This is probably a known deficiency reported in this bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70146
>
> Alexander
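
To make the addend adjustment from my PS concrete, here is a small
arithmetic sketch. It uses the i386 psABI formulas R_386_GOTOFF = S + A - GOT
and R_386_PC32 = S + A - P, and assumes the base register holds either the
GOT address (current scheme) or the PC value loaded in the prologue
(proposed scheme). The function names and every address are made up; this
only models the calculation and is not how GCC or the linker implement it.

```c
/* Worked arithmetic only; all names and addresses are illustrative. */
#include <stdint.h>

/* i386 psABI relocation formulas: S = symbol, A = addend, P = place. */
static uint32_t r_386_pc32(uint32_t S, int32_t A, uint32_t P)
{
    return S + (uint32_t)A - P;
}

static uint32_t r_386_gotoff(uint32_t S, int32_t A, uint32_t GOT)
{
    return S + (uint32_t)A - GOT;
}

/* Current scheme: the register holds the GOT address (after the add of
 * $_GLOBAL_OFFSET_TABLE_), and the displacement is a GOTOFF value. */
uint32_t addr_via_gotoff(uint32_t S, uint32_t GOT)
{
    return GOT + r_386_gotoff(S, 0, GOT);
}

/* Proposed scheme: the register holds the PC loaded in the prologue
 * (pc_base), and the addend is the byte distance from pc_base to the
 * relocated field at P. */
uint32_t addr_via_pc32(uint32_t S, uint32_t pc_base, uint32_t P)
{
    int32_t A = (int32_t)(P - pc_base);
    return pc_base + r_386_pc32(S, A, P);
}
```

In both cases the effective address works out to S, which is why the extra
add looks avoidable when only GOTOFF-style accesses are present.
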