Re: PIC is wasteful

On 24-06-2011 20:10, Agner Fog wrote:
Now I can make a 64-bit shared object with -mcmodel=large and without -fpic, and it works. But as I suspected, it is not optimal. It uses full 64-bit addresses for almost everything rather than 32-bit relative addresses. This is inefficient because it makes the code larger and because 64-bit absolute addressing is poorly supported in the x64 instruction set. The code loads the full absolute address into a 64-bit pointer register whenever it loads or stores a variable using any register other than eax, and whenever it calls an external function.

I need a memory model between medium and large to allow 32-bit relative addresses but not 32-bit absolute addresses.

I found it! There is a gcc option named -fpie which does exactly this. The manual says: "These options are similar to -fpic and -fPIC, but generated position independent code can be only linked into executables."

I tried it on a 64-bit example. What it really does is make relative references wherever it can, even in exception handler tables. It made absolute 64-bit references in only a few situations, and no absolute 32-bit references at all.

With this option, I can make a 64-bit shared object without PLT and GOT, and it works. The only thing I can't do is global variables. I have to avoid global variables or hide them with "static" or "__attribute__((visibility("hidden")))" to avoid the error in the linker which expects a GOT entry.
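A minimal sketch of the two ways to hide a variable mentioned above. The file name, the variable names, the function names and the build command in the comment are all assumptions made for illustration; they are not taken from my actual test code.

```c
/* counter.c: hypothetical example, all names are illustrative.
 *
 * Assumed build command for a 64-bit shared object (not verified here):
 *   gcc -O2 -fpie -shared -o libcounter.so counter.c
 */

/* 'static' gives internal linkage: the variable is invisible outside
 * this source file, so the linker has no reason to ask for a GOT entry. */
static int call_count = 0;

/* Hidden visibility: the variable can still be shared between the source
 * files of this library, but it is not exported from the shared object. */
__attribute__((visibility("hidden"))) int library_state = 0;

/* Exported function that uses both hidden variables. */
int next_id(void)
{
    call_count++;
    library_state = call_count * 10;
    return call_count;
}

/* Exported accessor, so that callers never touch the variable directly. */
int last_state(void)
{
    return library_state;
}
```

A caller outside the library would go through next_id() and last_state() only. A plain exported "int library_state;" without the attribute is the case that triggers the linker error described above.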

It does make a GOT entry, though, if there is a virtual table. This makes sense because the virtual table must be shared.

The -fpie option offers no advantage in 32-bit mode; it still makes GOT references there.

My conclusion now is:
If you don't need the ability to replace symbols, you can make a shared object much faster in the following way:
In 32-bit Linux: compile without -fpic.
In 64-bit Linux: compile with -fpie instead of -fpic; avoid global variables or hide them. You avoid the GOT and PLT lookups for local references, and you get rid of the clumsy calculation of relative addresses to the GOT in 32-bit mode.
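
To illustrate what a "local reference" means in the conclusion above, here is a small sketch of a library source file in which an exported function calls an internal helper. The file name, the function names and the build commands in the comment are assumptions for illustration only, and I have not listed the generated code here.

```c
/* scale.c: hypothetical example, all names are illustrative.
 *
 * Assumed build commands (sketches of the recipe above, not verified):
 *   32-bit: gcc -m32 -O2 -shared -o libscale.so scale.c
 *   64-bit: gcc -O2 -fpie -shared -o libscale.so scale.c
 */

/* Internal helper. Hidden visibility means it cannot be replaced by a
 * symbol of the same name from outside the library. */
__attribute__((visibility("hidden"))) int clamp_percent(int value)
{
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

/* Exported entry point. Its call to clamp_percent() is a local
 * reference, the kind that should not need a PLT lookup. */
int scale_by_percent(int amount, int percent)
{
    return amount * clamp_percent(percent) / 100;
}
```

The price is the one already stated: a program that loads this library cannot substitute its own clamp_percent().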

I don't know if this works in BSD and Mac.

Thank you everybody for explaining things to me, I wouldn't have found the solution alone.

