Re: optimizer gives up for large functions

On 12-12-05 2:53 PM, Peter Foelsche wrote:
This mail applies to g++ 4.5.2 running on 64bit linux

uname -a:

Linux NAME 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:33:05 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

I remember nearly 3 years ago g++ took 1 hour and 4GB of memory to compile some of my functions. And the generated code was exceptionally good!
This is not the case anymore -- I don't know since when.
Today I only get good code for small toy examples.
I used the -Wdisabled-optimization and used some parameter settings to get around this
-- but without result:

g++ -fPIC -shared -DNDEBUG -O3 -march=native -I /home/peterf/boost_1_47_0 -std=c++0x --param max-gcse-memory=1073741824 mosfet0.c -Wdisabled-optimization

One problem is that the copy constructor is really being expressed.
I was already thinking that some idiot at my company
changed the compiler to default to -fno-elide-constructors.

Another problem is that many operations are done on the stack instead of between registers.
I'm talking about floating point operations with double.
Too many of the following operations:

movapd
movaps
movddup
movq


Anybody any clue?

Some optimizations did not scale well, and there were a lot of changes to speed them up in some extreme cases. One of them is LRA. When a function is too big, simpler algorithms are used to speed up LRA. This was done to resolve some compilation-speed PRs, despite the concern I expressed in one of my emails:

"By the way, I can solve the compilation time problem by using simpler algorithms that harm performance. The author will be happy with the compilation speed but will be disappointed by, say, a 10% slower interpreter. I don't think it is a solution to the problem; it is creating a bigger problem. It seems to me I have to do this. Or, if I tell him that by waiting 40% more time he can get 15% smaller code, I guess he would prefer that. Of course, speeding up the compiler is an interesting problem, but we don't look at the whole picture when we solve compilation time by hurting performance."

I guess this is your case.

To be sure of my analysis, I need a testcase.

I guess we can solve the problem by defining some parameters that determine when LRA should use the simple algorithms. So if you want good code and are ready to wait an hour or two, you could increase the parameters.
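
As a rough illustration of that idea, here is a minimal C++ sketch of a size threshold that decides between the full and the simple algorithms. Everything in it is an assumption made for illustration: the names `AllocParams`, `max_insns_for_full_algorithm` and `choose_algorithm`, the default value, and the use of an instruction count as the size measure are hypothetical and do not correspond to actual GCC code or to any existing --param.

```cpp
#include <cstddef>

// Hypothetical tunable, standing in for a user-settable parameter.
// The name and default are invented for this sketch.
struct AllocParams {
    std::size_t max_insns_for_full_algorithm = 10000;
};

enum class Algorithm {
    Full,   // slower to compile, better generated code
    Simple  // faster to compile, possibly worse generated code
};

// Choose the strategy for one function from its size.
// Functions above the threshold fall back to the simple algorithms.
Algorithm choose_algorithm(std::size_t insn_count, const AllocParams& params) {
    if (insn_count > params.max_insns_for_full_algorithm) {
        return Algorithm::Simple;
    }
    return Algorithm::Full;
}
```

With such a knob, someone who prefers better code over compile time would raise the threshold so that even a very large function still gets the full algorithms.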




