Hi,

I have just run and timed a couple of tutorial examples for OpenMP using gcc (GCC) 4.2.1 (Ubuntu 4.2.1-5ubuntu4) on a dual-core Athlon amd64, with OMP_NUM_THREADS set to 1 and 2, and occasionally 8.

I found that 1 thread outperforms 2 by almost 2:1 on all the examples, and 8 is only fractionally slower than 2.

The code was compiled with just -fopenmp, no optimisation switches. OS: Ubuntu gutsy (7.10) with Linux 2.6.22-14-rt (with real-time patches).

Watching the system monitor, I confirmed that 1 thread maxes out 1 CPU and 2 threads max out both. The observable behaviour was also correct and as expected; it was just SLOW.

I've briefly looked at the current SVN source for libgomp and can't see anything wrong there. Can anyone venture an explanation as to what might be going wrong?

At least one of the tests has long independent tasks (many seconds) for each thread, so it doesn't seem to be a synchronisation issue. Differences like 4.7 seconds for 1 thread versus 13 seconds for 2 were regularly observed in ALL the tests. Note: these are the real-time (wall-clock) figures; the CPU times were even worse.

[BTW: I built current SVN for gcc as well, but the installed result didn't run properly due to a missing .spec file, so I couldn't check whether it behaved any differently.]

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net