I am trying to speed up the load and startup time of a shared library.
After reading Ulrich Drepper's paper "How to Write Shared Libraries",
the easiest thing to try seemed to be reducing the number of globally
visible symbols. I added __attribute__((visibility ("default"))) only to
the symbols that should be exported, and built with the GCC option
-fvisibility=hidden to hide everything else by default. This did reduce
the number of globally visible symbols. However, even though fewer
symbols now need relocation, the cost of searching for a symbol in the
"optimized" DSO appears to have gone up. Below is the output of
"eu-readelf -I" before and after the change. The average number of
tests for both successful and unsuccessful lookups has increased. I
haven't done any profiling yet, but I expect my runtime symbol lookup
cost to go up as a result. Is this normal, or am I missing something?
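
A minimal sketch of the export pattern described above may make the setup easier to follow. All names here (MYLIB_API, mylib_scale, clamp_to_byte, mylib.c) are hypothetical and not taken from the actual library; the build line in the comment is an assumed invocation, not the real one.

```c
/* mylib.c - hypothetical example. Assumed build command:
 *   gcc -shared -fPIC -fvisibility=hidden -o libmylib.so mylib.c
 */

/* Marks a symbol as exported even when the default visibility is hidden. */
#define MYLIB_API __attribute__((visibility("default")))

/* Internal helper. When built with -fvisibility=hidden it gets hidden
 * visibility, so it is left out of .dynsym and of the hash table. */
int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Public entry point: explicitly given default visibility, so it remains
 * in the dynamic symbol table. */
MYLIB_API int mylib_scale(int v, int factor)
{
    return clamp_to_byte(v * factor);
}
```

With a library built this way, "nm -D --defined-only libmylib.so" should list mylib_scale but not clamp_to_byte.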

BEFORE:

Histogram for bucket list length in section [ 1] '.gnu.hash' (total of 4099 buckets):
 Addr: 0x0000000000000158  Offset: 0x000158  Link to section: [ 2] '.dynsym'
 Symbol Bias: 652
 Bitmask Size: 4096 bytes  26% bits set  2nd hash shift: 15
 Length  Number  % of total  Coverage
      0    1123       27.4%
      1    1470       35.9%     28.1%
      2     955       23.3%     64.7%
      3     391        9.5%     87.1%
      4     132        3.2%     97.2%
      5      23        0.6%     99.4%
      6       5        0.1%    100.0%
 Average number of tests:   successful lookup: 1.617107
                          unsuccessful lookup: 1.274945

AFTER:

Histogram for bucket list length in section [ 1] '.gnu.hash' (total of 2053 buckets):
 Addr: 0x0000000000000158  Offset: 0x000158  Link to section: [ 2] '.dynsym'
 Symbol Bias: 652
 Bitmask Size: 4096 bytes  21% bits set  2nd hash shift: 15
 Length  Number  % of total  Coverage
      0     288       14.0%
      1     576       28.1%     14.7%
      2     575       28.0%     44.1%
      3     367       17.9%     72.2%
      4     165        8.0%     89.0%
      5      64        3.1%     97.2%
      6      16        0.8%     99.6%
      7       2        0.1%    100.0%
 Average number of tests:   successful lookup: 1.916007
                          unsuccessful lookup: 1.90794
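
The two averages can be recomputed from the histogram columns alone, which shows what they depend on. The sketch below assumes that a successful lookup of the k-th symbol in a chain costs k tests and that an unsuccessful lookup walks the whole chain of the bucket it lands in. These formulas are an assumption about how eu-readelf derives the figures, although they do reproduce the values printed above. The names hash_stats and chain_stats are made up for this illustration.

```c
#include <stddef.h>

/* Summary figures derived from a bucket-length histogram. */
struct hash_stats {
    unsigned long nbuckets;  /* total number of buckets                  */
    unsigned long nsyms;     /* total number of hashed symbols           */
    double successful;       /* average tests per successful lookup      */
    double unsuccessful;     /* average tests per unsuccessful lookup    */
};

/* counts[len] is the number of buckets whose chain holds `len` symbols,
 * i.e. the "Number" column indexed by the "Length" column. */
struct hash_stats chain_stats(const unsigned long *counts, size_t n)
{
    struct hash_stats s = {0, 0, 0.0, 0.0};
    unsigned long tests = 0;

    for (size_t len = 0; len < n; len++) {
        s.nbuckets += counts[len];
        s.nsyms += counts[len] * len;
        /* Finding every symbol in a chain of length len once costs
         * 1 + 2 + ... + len = len * (len + 1) / 2 tests. */
        tests += counts[len] * (len * (len + 1) / 2);
    }
    if (s.nsyms != 0)
        s.successful = (double)tests / (double)s.nsyms;
    if (s.nbuckets != 0)
        s.unsuccessful = (double)s.nsyms / (double)s.nbuckets;
    return s;
}
```

Fed the "Number" columns above, this gives 5226 symbols in 4099 buckets before the change and 3917 symbols in 2053 buckets after it. The symbol count dropped by about a quarter while the bucket count was halved, so the average chain length rose from about 1.27 to about 1.91.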