On 12-12-05 2:53 PM, Peter Foelsche wrote:
This mail applies to g++ 4.5.2 running on 64bit linux
uname -a:
Linux NAME 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:33:05 EDT 2009 x86_64
x86_64 x86_64 GNU/Linux
I remember that nearly 3 years ago g++ took 1 hour and 4 GB of memory to
compile some of my functions. And the generated code was exceptionally
good!
This is not the case anymore -- I don't know since when.
Today I only get good code for small toy examples.
I used -Wdisabled-optimization and some parameter settings to get
around this, but without result:
g++ -fPIC -shared -DNDEBUG -O3 -march=native -I
/home/peterf/boost_1_47_0 -std=c++0x --param
max-gcse-memory=1073741824 mosfet0.c -Wdisabled-optimization
One problem is that the copy constructor is actually being emitted
instead of elided.
I was already thinking that some idiot at my company
changed the compiler to default to -fno-elide-constructors.
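One way to test the -fno-elide-constructors suspicion is to count constructor calls directly. The following is only a minimal sketch: `Tracked`, `make_tracked`, and `constructor_calls_on_return` are hypothetical names made up for this check and are not taken from mosfet0.c.

```cpp
// Sketch: count copy/move constructor calls to see whether the compiler
// elides them. All names here are hypothetical, for illustration only.
struct Tracked {
    static int copies;  // copy-constructor calls observed so far
    static int moves;   // move-constructor calls observed so far
    double value;

    explicit Tracked(double v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked(Tracked&& other) : value(other.value) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a temporary; with elision enabled, no copy or move should run.
Tracked make_tracked(double v) { return Tracked(v); }

// Resets the counters, initializes one object from a returned temporary,
// and reports how many copy/move constructor calls that required.
int constructor_calls_on_return() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked t = make_tracked(1.5);
    (void)t;  // silence unused-variable warnings
    return Tracked::copies + Tracked::moves;
}
```

Printing the result of `constructor_calls_on_return()` from `main` should give 0 when elision is active. Built with -std=c++0x and -fno-elide-constructors, it should give a nonzero count, so comparing the two builds shows whether the default was changed.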
Another problem is that many operations are done on the stack instead
of between registers.
I'm talking about floating point operations with double.
Too many of the following operations:
movapd
movaps
movddup
movq
Anybody any clue?
Some optimizations did not scale well, and there have been a lot of
changes to speed them up in some extreme cases. One of them is LRA. When
a function is too big, simpler algorithms are used to speed up LRA. This
was done to solve some compilation-speed PRs, although I expressed my
concern about it in one of my emails:
"By the way, I can solve the compilation time problem by using simpler
algorithms, harming performance. The author will be happy with the
compilation speed but will be disappointed by, say, a 10% slower
interpreter. I don't think it is a solution to the problem; it is
creating a bigger problem. It seems to me I have to do this. Or if I
tell him that by waiting 40% more time he can get 15% smaller code, I
guess he would prefer this. Of course it is an interesting problem to
speed up the compiler, but we don't look at the whole picture when we
solve compilation time by hurting the performance."
I guess this is your case.
To be sure of my analysis, I need a test case.
I guess we can solve the problem by defining some parameters that
determine when LRA should use the simple algorithms. So if you want to
get good code and are ready to wait an hour or two, you could increase
the parameters.
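To make the idea concrete, here is a rough sketch of a size threshold controlled by a parameter. It is not GCC source: `Strategy`, `Params`, `max_function_size`, and `choose_strategy` are hypothetical names, and the real decision in LRA may depend on other measures than a single size value.

```cpp
// Hypothetical sketch, not GCC code: pick an allocation strategy from a
// user-tunable size limit. Every name here is invented for illustration.
enum class Strategy { Full, Simple };

struct Params {
    // Hypothetical limit: functions larger than this get the simple
    // (faster to compile, worse code) algorithms.
    long max_function_size;
};

// Returns Simple only when the function exceeds the configured limit.
Strategy choose_strategy(long function_size, const Params& params) {
    return function_size > params.max_function_size ? Strategy::Simple
                                                    : Strategy::Full;
}
```

With a limit like this, raising the parameter keeps the full algorithms enabled for larger functions, at the cost of longer compilation.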