Dear gcc-help list,
I noticed that starting an OpenMP parallel section takes a significant
amount of time on Nehalem CPUs with hyper-threading (HTT) enabled.
The difference between HTT turned on and off is huge:
- HTT disabled: about 100,000 parallel sections per second
- HTT enabled: about 15 parallel sections per second
Is this a known problem? It apparently has something to do with setting
the CPU affinity: when I set the GOMP_CPU_AFFINITY environment variable
to "0-7", it is almost as fast as with HTT disabled.
This is the code I used to test it. Simply compile it with -fopenmp. To
time it with HTT disabled, I used 100,000 iterations instead of 100.
========================
int main(void) {
    int i;
    /* Start 100 empty parallel regions; only the startup cost matters. */
    for (i = 0; i < 100; i++) {
        #pragma omp parallel
        {
        }
    }
    return 0;
}
========================
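In case it helps to reproduce the numbers, here is a rough sketch of a
self-timing variant of the same loop. The function name
sections_per_second and the use of clock_gettime(CLOCK_MONOTONIC) are
assumptions made for illustration only. It is not the program listed
above, and any other wall-clock timer would do equally well.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime in strict -std= modes */
#endif
#include <time.h>

/* Hypothetical helper: start `iterations` empty parallel regions and
   return how many were started per second of wall-clock time. */
double sections_per_second(int iterations)
{
    struct timespec start, end;
    double elapsed;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        #pragma omp parallel
        {
            /* empty on purpose: only the startup cost is measured */
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    elapsed = (double)(end.tv_sec - start.tv_sec)
            + (double)(end.tv_nsec - start.tv_nsec) / 1e9;

    /* Guard against a zero interval on very coarse clocks. */
    return elapsed > 0.0 ? iterations / elapsed : 0.0;
}
```

Calling sections_per_second(100) from main and printing the result
should give a figure comparable to the ones quoted above. Compile with
-fopenmp as before; on older glibc versions clock_gettime may also
need -lrt.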
System specs:
OS: Ubuntu 9.10, amd64 (2.6.31-19-generic)
gcc: version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)
cpu: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
Cheers,
Edwin