Global variables: -fno-common vs -fcommon on x86 32-bit vs 64-bit

Arvind Sankar <nivedita@xxxxxxxxxxxx> · Sun, 19 May 2019 14:49:54 -0400

Hi, I am trying to understand the code generated for accesses to global
variables in PIE code, and the reason for some differences between the
32-bit and 64-bit cases.

Take the following example input file:
int a, b;
int f(void) { return a + b; }

With -m64, identical code is generated for -fcommon and -fno-common:
        movl    b(%rip), %eax
        addl    a(%rip), %eax
using R_X86_64_PC32 relocations. So in both cases the calculation is
S + A - P and we do not use any GOT entries.
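
To make the S + A - P arithmetic concrete, here is a minimal C sketch (not
toolchain code). The function names and sample addresses are hypothetical.
The addend of -4 assumes the 32-bit displacement is the last field of the
instruction, so that the next instruction starts 4 bytes after P:

```c
#include <assert.h>
#include <stdint.h>

/* R_X86_64_PC32: the linker stores S + A - P in the 32-bit field at P. */
int32_t reloc_x86_64_pc32(uint64_t S, int64_t A, uint64_t P)
{
    return (int32_t)(uint32_t)(S + (uint64_t)A - P);
}

/* RIP-relative addressing adds the displacement to the address of the
 * next instruction. */
uint64_t rip_relative_target(uint64_t next_insn, int32_t disp)
{
    return next_insn + (uint64_t)(int64_t)disp;
}

/* Hypothetical layout: symbol at 0x4010, displacement field at 0x1132. */
void pc32_selfcheck(void)
{
    uint64_t S = 0x4010, P = 0x1132;
    int32_t disp = reloc_x86_64_pc32(S, -4, P);

    /* The access lands on the symbol itself; no GOT slot is involved. */
    assert(rip_relative_target(P + 4, disp) == S);
}
```

Calling pc32_selfcheck() passes silently if the displacement resolves to
the symbol's own address.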

With -m32 -fno-common:
        call    __x86.get_pc_thunk.dx
        addl    $_GLOBAL_OFFSET_TABLE_, %edx
        movl    b@GOTOFF(%edx), %eax
        addl    a@GOTOFF(%edx), %eax
using R_386_GOTOFF relocations. Calculation is S + A - GOT. The code
does not go through the GOT entries.

With -m32 -fcommon:
        call    __x86.get_pc_thunk.ax
        addl    $_GLOBAL_OFFSET_TABLE_, %eax
        movl    a@GOT(%eax), %edx
        movl    b@GOT(%eax), %eax
        movl    (%eax), %eax
        addl    (%edx), %eax
using R_386_GOT32X relocations calculated as G + A - GOT. This time we
use the GOT entries.
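
The difference between the two 32-bit sequences can be modelled with a toy
memory image in C. This is only a sketch: the layout, addresses and helper
names are made up, and G is taken here to be the address of the symbol's
GOT slot, so that G + A - GOT is the slot's offset from the GOT base:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte addresses inside a 64-byte toy image. */
enum { GOT = 0x10, SLOT_A = 0x14, SLOT_B = 0x18, ADDR_A = 0x20, ADDR_B = 0x24 };

static uint32_t mem[16];

uint32_t load32(uint32_t addr) { return mem[addr / 4]; }

void toy_image_init(void)
{
    mem[ADDR_A / 4] = 7;        /* a */
    mem[ADDR_B / 4] = 35;       /* b */
    mem[SLOT_A / 4] = ADDR_A;   /* GOT slot holding &a */
    mem[SLOT_B / 4] = ADDR_B;   /* GOT slot holding &b */
}

/* R_386_GOTOFF: S + A - GOT */
uint32_t reloc_gotoff(uint32_t S, int32_t A, uint32_t got)
{
    return S + (uint32_t)A - got;
}

/* R_386_GOT32X (with a base register): G + A - GOT */
uint32_t reloc_got32x(uint32_t G, int32_t A, uint32_t got)
{
    return G + (uint32_t)A - got;
}

/* -fno-common shape: one load per variable, relative to the GOT base. */
uint32_t f_via_gotoff(uint32_t got_base)
{
    return load32(got_base + reloc_gotoff(ADDR_B, 0, got_base))
         + load32(got_base + reloc_gotoff(ADDR_A, 0, got_base));
}

/* -fcommon shape: load the address from the GOT slot, then load the value. */
uint32_t f_via_got(uint32_t got_base)
{
    uint32_t pa = load32(got_base + reloc_got32x(SLOT_A, 0, got_base));
    uint32_t pb = load32(got_base + reloc_got32x(SLOT_B, 0, got_base));
    return load32(pb) + load32(pa);
}
```

After toy_image_init(), both f_via_gotoff(GOT) and f_via_got(GOT) return
the same sum; the second form simply performs two extra loads.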

Why can the -m32 -fcommon case not use the same code as the -m32
-fno-common case and avoid the indirection through the GOT, when the
64-bit code avoids it regardless of the -fcommon setting? For 64-bit,
indirection through the GOT appears to be done only with -fPIC.

PS: Separate comment on 32-bit code: if the function only uses
R_386_GOTOFF relocations, is it not possible to replace them with
R_386_PC32, adjusting the addend for the number of instruction bytes
between the prologue code where we loaded the PC and the instruction
that accesses the variable? This would eliminate the instruction to add
$_GLOBAL_OFFSET_TABLE_. With the addition of an R_386_GOTPCREL
relocation similar to the 64-bit case, the R_386_GOT32 relocations could
also be handled that way.
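
A rough C sketch of the addend adjustment I have in mind, using the
R_386_PC32 calculation S + A - P. The names and addresses are hypothetical.
It assumes the thunk returns the address of the instruction following the
call, and that the accessing instruction follows 2 opcode/ModRM bytes
later:

```c
#include <assert.h>
#include <stdint.h>

/* R_386_PC32: S + A - P */
uint32_t reloc_386_pc32(uint32_t S, int32_t A, uint32_t P)
{
    return S + (uint32_t)A - P;
}

/* Addend that makes the field PC-thunk-relative: the number of bytes
 * between the PC value returned by the thunk and the relocated field. */
int32_t addend_for_thunk_pc(uint32_t P, uint32_t thunk_pc)
{
    return (int32_t)(P - thunk_pc);
}

void thunk_pc32_selfcheck(void)
{
    uint32_t thunk_pc = 0x1005;     /* hypothetical return address of the call */
    uint32_t P = thunk_pc + 2;      /* displacement field of the next movl */
    uint32_t S = 0x4020;            /* hypothetical address of the variable */

    int32_t A = addend_for_thunk_pc(P, thunk_pc);
    uint32_t disp = reloc_386_pc32(S, A, P);

    /* register + displacement reaches the symbol without adding
     * $_GLOBAL_OFFSET_TABLE_ first. */
    assert(thunk_pc + disp == S);
}
```

With this addend the register loaded by the thunk can be used as the base
directly, which is the instruction saving described above.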