On Wed, Jul 02, 2008 at 04:29:32PM +0200, Vincent Lefevre wrote:
> On one of my programs (that has many branches in the internal loop),
> I've found that gcc 4.3.1 generates less efficient code than gcc 4.1.2.
> Now, I'm not sure I select the right optimization options.
>
> For instance, here are various timings I got on various x86_64 machines.
> Is there something else I should test? Could this be regarded as a bug
> in gcc 4.3 (though the code is correct, the timing is unexpected)?
>
> In the tables below, pgen=0 means without profile generation, and
> pgen=8 means a first compilation with -fprofile-generate, a test on
> a subset, a second compilation with -fprofile-use, and the timing
> on the obtained binary.

Without having the code in a bug report, there is no way to say what the
problem is. It is best if you can take some time to reduce the code to an
example that shows clearly where the slowdown occurs. You can use normal
-pg profiling, oprofile, or tools like Code Analyst/Vtune to identify where
the hot spots are if you don't already know where the hot function is. The
simpler you make the example, the more likely somebody will fix it (unless
you pay somebody to fix it, and then presumably as part of the
investigation, they will reduce it).

> Since this is code meant to run for millions of hours, the efficiency
> is really important.

Then presumably it is worth some time to report the bug in a usable fashion.

-- 
Michael Meissner
email: gnu@xxxxxxxxxxxxxxxxx
http://www.the-meissners.org