Question on object code layout
Hi,
Does GCC provide any mechanisms to optimise how the object code is
laid out? Or perhaps put in other words, which order functions are
output into the assembler?
We are seeing repetition of template code in our shared libraries, with a
significant number of duplicates across several libraries. (We have
~1500 shared libraries, of which several hundred will be used by any
one program.) Since at run time only one instance of each template
will be used, this is leaving a lot of unused "holes" in the text
segments. We have also observed that the template instances appear
to be scattered (in shared library linear address space) amongst
normal code.
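
To make the situation concrete, here is a minimal sketch of the kind of
translation unit we mean. The file name and the function names
(widget.cpp, clamp_to, sum_clamped) are made up for illustration and are
not taken from our code base.

```cpp
// widget.cpp: hypothetical translation unit; all names are illustrative.
#include <vector>

// Template code: every translation unit that uses clamp_to<int> gets its
// own implicit instantiation, which GCC emits as a weak symbol.
template <typename T>
T clamp_to(T value, T lo, T hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

// "Normal" code: an ordinary function, emitted once as a strong symbol.
int sum_clamped(const std::vector<int>& values, int lo, int hi) {
    int total = 0;
    for (std::vector<int>::size_type i = 0; i < values.size(); ++i)
        total += clamp_to(values[i], lo, hi);
    return total;
}
```

Compiling this without optimisation (g++ -c widget.cpp) and listing the
symbols with nm -C widget.o should show clamp_to<int> and the
std::vector members as weak ("W") next to sum_clamped as ordinary text
("T"). When the same instantiation is compiled into several shared
libraries, each library keeps its own copy, but only one is used at run
time.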
While we are looking at reducing the number of shared libraries, we'd
also like to investigate optimising the internal structure of each
library to have better locality for L1 instruction cache and less
pressure on ITLB. Do you have any suggestions on how to re-order the
contents of a single shared library better, or even the contents of a
single object file? In particular, is there a way to tell GCC to
output "normal" code first, and only then all the template instances,
or vice versa? Any pointers to documentation anywhere would be very
welcome!
We are currently using GCC 3.4.5 on a RHEL4-derived Linux system, but
are also starting tests with GCC 4.x. Upgrading from RHEL4 is not
realistically possible, but we can build our own version of binutils
if we can identify a version that's "safe" for use.
We assume we are looking for something a bit like SGI's "cord" or GNU
rope, or some profile-guided compile/link optimisation, but we
haven't yet come across anything that is even remotely alive on
Linux. We hope we've just missed it...
Thanks in advance!
Lassi