On 04/01/2013 04:58 AM, Adam Jackson wrote:
> On Fri, 2013-03-29 at 10:48 -0700, John Reiser wrote:
>
>> -fPIE code is larger and takes longer to execute.  The cost varies from
>> minimal (< 2%) in many cases to 10% or more for "non-dynamic" arrays on i686.
>
> Citation needed.

ftp://ftp.inf.ethz.ch/doc/tech-reports/7xx/766.pdf
which is cited by the FESCO ticket
https://fedorahosted.org/fesco/ticket/1104#comment:11

It's also easy to see the mechanism:

    $ cat foo.c
    extern int a[];
    void foo(int j) { a[j]=j; }
    $ gcc -m32 -fPIE -O -S foo.c
    $ cat foo.s   # edited for brevity
    foo:  # 25 bytes; about 15 cycles (incl. 3*3 cycles data cache fetch latency)
            call    __x86.get_pc_thunk.cx
            addl    $_GLOBAL_OFFSET_TABLE_, %ecx
            movl    4(%esp), %eax
            movl    a@GOT(%ecx), %edx
            movl    %eax, (%edx,%eax,4)
            ret
    $ gcc -m32 -O -S foo.c
    $ cat foo.s   # edited for brevity
    foo:  # 12 bytes; about 6 cycles (incl. 1*3 cycles data cache fetch latency)
            movl    4(%esp), %eax
            movl    %eax, a(,%eax,4)
            ret
    $

-fPIE forces an additional level of run-time indirection, which often costs
around 13 bytes (CALL + ADD + fetch GOT - d32) and 2 to 5 cycles (fetch @GOT
and cache latency).  Some of the cost might be shared with other nearby uses,
but scarcity of registers often inhibits sharing or requires spill code.

>
>> -fPIE for Thumb mode on ARM is particularly painful.
>
> Citation needed.

The same code above applies.  Thumb mode has no double indexing, so an
explicit ADD is required.  Registers are still in short supply; HI registers
(>=8) have dedicated usage or restricted access.  Also, the range of the
offset in base_register+offset addressing mode is severely restricted, which
often requires more explicit ADDs.
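
For readers who prefer C to assembly, the extra level of indirection can be modeled roughly as below. This is only a sketch of the access pattern, not what gcc emits: `fake_got`, `store_direct` and `store_via_got` are hypothetical names made up for illustration, and a real GOT is built by the linker and located through the PC thunk rather than passed as an argument.

```c
#include <stddef.h>

/* Stand-in for the external array `a` from foo.c (size chosen arbitrarily). */
int a[16];

/* Hypothetical one-entry "GOT": a table holding the run-time address of `a`. */
int *fake_got[1] = { a };

/* Non-PIE shape: the address of `a` is a constant embedded in the store. */
void store_direct(int j)
{
    a[j] = j;
}

/* PIE shape: fetch the address of `a` from the table, then store through it. */
void store_via_got(int *const *got, int j)
{
    int *base = got[0];  /* the extra memory fetch ("fetch @GOT") */
    base[j] = j;         /* same store, but the base now occupies a register */
}
```

Both functions store the same value in the same place; the second needs one more load and one more live register, which is where the extra bytes (25 versus 12 in the listings above) and cycles come from.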