On 19/05/2019 22:24, Arvind Sankar wrote:
On Sun, May 19, 2019 at 09:26:53PM +0200, David Brown wrote:
On 19/05/2019 20:49, Arvind Sankar wrote:
Hi, I am trying to understand the code generated when accessing global
variables in PIE code and the reason for some differences between 32-bit
and 64-bit cases.
For the example input file:

    int a, b;
    int f(void) { return a + b; }

With -m64, identical code is generated for -fcommon and -fno-common:

    movl b(%rip), %eax
    addl a(%rip), %eax

using R_X86_64_PC32 relocations. So in both cases the calculation is
S + A - P and we do not use any GOT entries.
With -m32 -fno-common:

    call __x86.get_pc_thunk.dx
    addl $_GLOBAL_OFFSET_TABLE_, %edx
    movl b@GOTOFF(%edx), %eax
    addl a@GOTOFF(%edx), %eax

using R_386_GOTOFF relocations. Calculation is S + A - GOT. The code
does not go through the GOT entries.
With -m32 -fcommon:

    call __x86.get_pc_thunk.ax
    addl $_GLOBAL_OFFSET_TABLE_, %eax
    movl a@GOT(%eax), %edx
    movl b@GOT(%eax), %eax
    movl (%eax), %eax
    addl (%edx), %eax

using R_386_GOT32X relocations calculated as G + A - GOT. This time we
use the GOT entries.
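
To make the three calculations concrete, here is a small worked sketch in C. All addresses are made up for illustration, the function names are hypothetical, and G is assumed to be the address of the symbol's GOT slot so that the result is an offset from the GOT base. It only models the arithmetic, not what the linker actually does internally.

```c
#include <assert.h>
#include <stdint.h>

/* S = symbol address, A = addend, P = address of the relocated field,
 * GOT = address of the GOT base. G is taken here as the address of the
 * symbol's GOT slot (an assumption made for this sketch). */
uint32_t reloc_pc32(uint32_t S, uint32_t A, uint32_t P)     { return S + A - P; }
uint32_t reloc_gotoff(uint32_t S, uint32_t A, uint32_t GOT) { return S + A - GOT; }
uint32_t reloc_got32x(uint32_t G, uint32_t A, uint32_t GOT) { return G + A - GOT; }

/* All addresses below are invented for illustration only. */
void check_reloc_examples(void)
{
    const uint32_t b = 0x4010, got = 0x3000, b_slot = 0x2ff8;
    const uint32_t field = 0x1002;

    /* -m64: addend is -4 because %rip points just past the 4-byte field. */
    uint32_t disp = reloc_pc32(b, (uint32_t)-4, field);
    assert((uint32_t)(field + 4 + disp) == b);

    /* -m32 -fno-common: GOT base plus the offset reaches b directly. */
    assert((uint32_t)(got + reloc_gotoff(b, 0, got)) == b);

    /* -m32 -fcommon: GOT base plus the offset reaches the GOT slot,
     * and a second load from that slot yields b's address. */
    assert((uint32_t)(got + reloc_got32x(b_slot, 0, got)) == b_slot);
}
```

Calling check_reloc_examples() from a test program runs the three checks. The last case is the one that costs the extra load.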
Why can the -m32 -fcommon case not use the same code as the -fno-common
case in order to avoid indirecting through the GOT, while for 64-bit we can
achieve that regardless of the -fcommon setting? For 64-bit, indirection
through GOT appears to be done only with -fPIC.
PS: Separate comment on 32-bit code: if the function only uses
R_386_GOTOFF relocations, is it not possible to replace them with
R_386_PC32, adjusting the addend for the number of instruction bytes
between the prologue code where we loaded the PC and the instruction
that accesses the variable? This would eliminate the instruction to add
$_GLOBAL_OFFSET_TABLE_. With the addition of an R_386_GOTPCREL
relocation similar to the 64-bit case, the R_386_GOT32 relocations could
also be handled that way.
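
The addend adjustment suggested in the PS can be checked with a little arithmetic. The sketch below is only an illustration: the addresses and function names are hypothetical, and it assumes the register holds the return address of the thunk call. With the addend set to the distance from that return address to the relocated field, S + A - P comes out as the symbol's offset from the value in the register.

```c
#include <assert.h>
#include <stdint.h>

/* R_386_PC32 as given in the thread: S + A - P. */
uint32_t pc32_value(uint32_t S, uint32_t A, uint32_t P) { return S + A - P; }

/* Addend needed so that (loaded PC + displacement) lands on the symbol:
 * the number of bytes between the loaded PC and the relocated field. */
uint32_t addend_for_base(uint32_t P, uint32_t base_pc) { return P - base_pc; }

/* All addresses below are invented for illustration only. */
void check_pc32_substitution(void)
{
    const uint32_t call_addr = 0x1000;
    const uint32_t base_pc   = call_addr + 5; /* call rel32 is 5 bytes long */
    const uint32_t field     = base_pc + 2;   /* opcode + modrm precede disp32 */
    const uint32_t sym       = 0x4010;

    uint32_t A    = addend_for_base(field, base_pc);
    uint32_t disp = pc32_value(sym, A, field);

    /* The register already holds base_pc, so no add of the GOT base is needed. */
    assert((uint32_t)(base_pc + disp) == sym);
}
```

This only shows that the arithmetic works out. Whether the toolchain could emit such relocations in practice is the open question being asked.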
Why not just use -fno-common and be happy? The existence of "common"
symbols is a massive design fault, IMHO - it comes from a time before C
was standardised and when C compilers behaved differently. Any program
that relies on "-fcommon" being active is relying on non-standard
behaviour. gcc is very good at providing options and flags to deal with
legacy code or code that relies on non-standard behaviour (like
-fno-strict-aliasing, -fwrapv), and it is great that it provides
-fcommon for a similar purpose. But "-fno-common" should have been the
default from day one of gcc - it gives better object code,
encourages clearer and more correct (and portable) source code, and
gives better static error checking. Unless your code relies on
"-fcommon", and you can't fix the code, my advice is to use "-fno-common".
I'm just trying to understand the differences, I don't actually plan to
write code that relies on common variables.
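
For reference, a minimal sketch of the pattern that does not depend on common variables is shown below. The split into a header and a single defining file is described in comments only, and the file names are hypothetical. With -fcommon, a second file that also contained "int a;" would be merged silently at link time, whereas with -fno-common the linker reports a multiple definition.

```c
/* What would go in a shared header (hypothetical globals.h):
 * declarations only, no storage is allocated here. */
extern int a, b;

/* What would go in exactly one .c file (hypothetical globals.c):
 * the single definition of each variable. The explicit initializer
 * makes these ordinary definitions rather than tentative ones. */
int a = 0;
int b = 0;

/* Any file that includes the header can then use the variables. */
int f(void) { return a + b; }
```

Code written this way should build the same with either -fcommon or -fno-common.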
Fair enough.
Have a look at the "-fsection-anchors" option too. It can have a good
effect on reducing GOT accesses, but AFAIK it will have little effect on
variables that have gone in a "common" section. On the targets I use
most, which are embedded systems with absolute linking, "-fno-common
-fsection-anchors" can make a significant difference to variable access
efficiency.
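
As a rough illustration of the idea, the effect can be imitated by hand by grouping variables in one object, so that a single base address plus small constant offsets reaches all of them. This is only an analogy written for this sketch, with hypothetical names. It is not a description of how gcc implements the option.

```c
/* Hand-written stand-in for an anchor: one object holding both variables. */
struct globals {
    int a;
    int b;
};

static struct globals g = { 1, 2 };

int sum_grouped(void)
{
    /* One address is materialized, then both members are reached
     * through constant offsets from it. */
    const struct globals *base = &g;
    return base->a + base->b;
}
```

Comparing the generated assembly for this function with a version that uses two separate globals, on a target where address loads are expensive, should show whether the grouping helps.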