Re: GCC is 7 times slower than Intel? How to optimize? Need help!!

John Fine <johnsfine@xxxxxxxxxxx> · Sat, 04 Oct 2008 17:42:56 -0400

I recently used OProfile to track down a performance problem that I 
think is the same one you are seeing, although the ratio between the 
gcc and Intel timings was MUCH higher in the case I was testing.

If your issue turns out to be something different, I suggest OProfile as 
the best tool to identify it.

If it is the same issue, then the code generated by the compiler is not 
what matters.  The difference comes from something strange that happens 
inside the GNU version of exp() in libm and does not happen in the Intel 
version of exp() in libimf.

To work around this problem, I link in Intel's libimf ahead of libm even 
when using the gcc compiler.

I haven't had time to dig through the source code of GNU exp() to figure 
out what is really going on, but both OProfile and gdb indicated that 
exp() sometimes calls out to a VERY slow multi-precision routine.  When 
that happens, a single exp() call can take a thousand times longer than 
the Intel version.  The overall performance ratio is then determined by 
what fraction of your exp() calls make the GNU exp() code fall back to 
the super slow version.

If any experts are reading this thread and have a better understanding 
of the issue, I'd like to hear the explanation.  I didn't investigate 
much beyond what I described above.

jackfrost wrote:
// Very simple array function calculation:
#include <math.h>
#include <time.h>
#include <stdio.h>

static double A[50000000];

int main(int argc, char *argv[])
{
    // Fill the array with pseudo-random data in [-5.55, 5.55]
    for (int t = 0; t < 50000000; t++)
        A[t] = 5.55 * sin(t);

    clock_t time0 = clock();   // clock() returns clock_t, not time_t

    // Timed loop: one exp() call per element
    for (int t = 0; t < 50000000; t++)
        A[t] = exp(A[t]);

    printf("%g\n", ((double)(clock() - time0)) / CLOCKS_PER_SEC);
    return 0;
}

The time for this code compiled with the Intel 10 compiler is 1.2 sec.
The result for code compiled with GCC (v3 and v4) is 7.2 sec.

I've tried all the optimization options: -mfpmath=sse -msse2 -O3
-mtune=pentium-m
But Intel is still 7 times faster.