Is there a performance penalty for accessing a global with rip on an
architecture as new as Nehalem, in comparison to accessing variables
using rsp?

>>>> 4006f9: f0 80 0d bf 0b 20 00 lock orb $0x0,0x200bbf(%rip)
>>>> # 6012c0 <t2lockor>
>>>> 400700: 00

>>>> 4006f9: f0 80 0d bf 0b 20 00 lock orb $0x0,0x0(%rsp)
>>>> # 6012c0 <t2lockor>
>>>> 400700: 00

The latter seems to perform better.

Thanks

Xin

On Tue, Feb 7, 2012 at 12:44 PM, Andrew Haley <aph@xxxxxxxxxx> wrote:
> On 02/07/2012 05:33 PM, Xin Tong wrote:
>> On Tue, Feb 7, 2012 at 12:09 PM, Andrew Haley <aph@xxxxxxxxxx> wrote:
>>> On 02/07/2012 04:56 PM, Xin Tong wrote:
>>>> I am wondering how gcc accesses global variables on x86. from the code
>>>> i have seen so far, it seems to use the %RIP as the base register. Is
>>>> it always like this?
>>>>
>>>> 4006f9: f0 80 0d bf 0b 20 00 lock orb $0x0,0x200bbf(%rip)
>>>> # 6012c0 <t2lockor>
>>>> 400700: 00
>>>>
>>>> t2lockor is a global variables.
>>>
>>> This is x86_64, I think. The answer is that it depends on whether you
>>> are using PIC, and the model you're using. Try -mcmodel=large for a
>>> variation. PC-relative loads are convenient for everything except the
>>> large memory model.
>>
>> What are large memory model and PIC, can you please briefly explain.
>
> For large memory model see -mcmodel in the gcc docs.
> PIC means position independent code, as in the gcc command -fpic.
>
>> the PC of instructions are going to change in the linkage stage. so
>> the linker patches the offset if rip is used to access the global
>> variables.
>
> Yes.
>
> Andrew.
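
For reference, the address that objdump prints in its comment (6012c0) can
be reproduced by hand from the %rip form in the dump: on x86_64 the 32-bit
displacement is added to the address of the next instruction, not the
current one. The following is only a minimal sketch of that arithmetic. The
function name rip_relative_target is hypothetical (it is not part of gcc or
binutils), and the instruction length of 8 bytes is an assumption read off
the dump: seven bytes on the line at 4006f9 plus the trailing 00 at 400700.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only (not a gcc/binutils API).
 * Computes the address a RIP-relative memory operand refers to:
 * the displacement is relative to the address of the NEXT instruction,
 * which is insn_addr + insn_len. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len,
                             int32_t disp32)
{
    /* Sign-extend the 32-bit displacement before adding it. */
    return insn_addr + insn_len + (uint64_t)(int64_t)disp32;
}
```

With the values from the dump, rip_relative_target(0x4006f9, 8, 0x200bbf)
evaluates to 0x6012c0, the address shown for t2lockor. This displacement is
the value the linker fills in once the final addresses are known.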