Re: Problem with gfortran


On 1/2/2011 11:56 PM, Pietiläinen Ville wrote:
Hi,
I have a Dell T5500 desktop with 64-bit Windows 7, 12 GB RAM and two 12 core processors. I have Fortran 90 source code that I did not write. As far as I know, the programmer uses 32-bit Windows and is not familiar with my problems.
The code has many loops which are parallelized using OpenMP.

I compiled using gfortran (I have tried many versions from different sources; the current one is 4.6 experimental). I used the command "c:\mingw644\bin\gfortran -fopenmp -m64 -O3 -march=native -fmax-stack-var-size=10000000 ." followed by all the source files. The program runs beautifully, but it does not seem to use all the available computational power. When I compiled with Intel Fortran, the code used all 24 of my cores at 100%; I checked this in Task Manager. With the gfortran build, however, 12 of the cores may sit "parked", doing nothing. I then tried a different, somewhat larger input model; now it uses all 24 cores, but none of them at full power. It uses only 50-60% of total CPU.

Is there something I am doing wrong with gfortran, or is this just a difference between Intel Fortran and gfortran? I would really like to use gfortran, but I would also like to use all the computational power my desktop has.

Thank you very much

Best regards,
Ville Pietiläinen

Last time I looked up the Dell T5500, it was a standard dual-socket Intel CPU platform. Even if you have the 6-core CPUs (not listed as available on the Dell site), you would get 12 logical processors per socket only through HyperThreading. Only a few specialized applications achieve a performance boost from more than one OpenMP thread per core, even with an OpenMP library intended for that purpose (unlike libgomp for mingw).

You didn't mention your GOMP_CPU_AFFINITY setting. If your OpenMP application is well written and you can use the default large-chunk static scheduling, you would want the standard affinity, with contiguous scheduling of ranks. This would be particularly important if you actually do have 12-core CPUs (AMD 6+6 core), where there is no on-die communication between the groups of 6 cores. Likewise, with Intel OpenMP on an AMD CPU, you would schedule affinity by the numbers (KMP_AFFINITY proclist option).

In case you didn't set OMP_NUM_THREADS, I hope you recognize that certain libraries detect the presence of HyperThreading and attempt to default to 1 thread per core, as that normally would be the optimum setting.

My own experience with libgomp on Windows (both mingw 64-bit and cygwin 32-bit) has been unsatisfactory, unlike on linux, where libgomp is at least competitive with more than one commercial OpenMP. If the problem is due in part to the inability to implement GOMP_CPU_AFFINITY for Windows, you can't expect random assignment of threads to help. Gfortran for mingw 64-bit has become excellent, with the exception of the OpenMP implementation. Even Microsoft hasn't been able to demonstrate fully competitive performance under Windows with 6-, 10- and 12-core CPUs, at least not for the classes of applications which usually run OpenMP and MPI (which admittedly aren't their priority).

But I'm not clear about whether you care about getting performance from OpenMP, as your emphasis seems to be only on keeping more logical processors busy.
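
One way to check the OMP_NUM_THREADS point above is to ask the OpenMP runtime what it sees. The following is only a minimal sketch, not part of the original code: the program name omp_report is hypothetical, and it uses nothing beyond the standard omp_lib routines omp_get_num_procs, omp_get_max_threads and omp_get_num_threads.

```fortran
! Minimal sketch (hypothetical program name): report what the OpenMP
! runtime sees, so the default thread count can be compared between builds.
program omp_report
  use omp_lib                      ! standard OpenMP module (libgomp provides it for gfortran)
  implicit none
  integer :: nthreads

  ! Logical processors the runtime believes are available.
  print '(a,i0)', 'Logical processors seen by OpenMP: ', omp_get_num_procs()

  ! Upper bound on threads for the next parallel region
  ! (follows OMP_NUM_THREADS if that variable is set).
  print '(a,i0)', 'Default maximum number of threads: ', omp_get_max_threads()

  nthreads = 0
  !$omp parallel
  !$omp single
  ! Exactly one thread records the actual team size.
  nthreads = omp_get_num_threads()
  !$omp end single
  !$omp end parallel

  print '(a,i0)', 'Threads in the parallel region:    ', nthreads
end program omp_report
```

Built with the same flags as the real code (for example "gfortran -fopenmp omp_report.f90", where the file name is again hypothetical), it can be run once with OMP_NUM_THREADS unset and once after "set OMP_NUM_THREADS=12" in the Windows command prompt. The value 12 assumes two 6-core CPUs with HyperThreading, which is a guess about this machine rather than a confirmed fact.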


--
Tim Prince


