Re: gfortran 4.2 with openMP: why no speedup?

Nelson,

Thanks for your advice. I just figured out that the apparent lack of
a speedup was illusory: I was looking at the CPU time, which adds up
the time spent in every thread, rather than the wall-clock time. That
resolves my primary concern.
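
For the archives, here is a minimal sketch of the difference between
the two timers. The program name, the loop body and the size n are made
up for illustration. It assumes that cpu_time reports CPU time for the
whole process (which is what I see with gfortran) and uses system_clock
for elapsed time, so it builds with or without -fopenmp.

```fortran
program timing_demo
  ! Sketch: compare CPU time and wall-clock time around an OpenMP loop.
  ! Build with: gfortran -fopenmp timing_demo.f90
  ! The loop body and the size n are illustrative only.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10000000
  real(dp) :: cpu0, cpu1, total
  integer :: tick0, tick1, rate
  integer :: i

  call cpu_time(cpu0)               ! CPU time used by the process so far
  call system_clock(tick0, rate)    ! wall-clock ticks and ticks per second

  total = 0.0_dp
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + sqrt(real(i, dp))
  end do
  !$omp end parallel do

  call cpu_time(cpu1)
  call system_clock(tick1)

  print *, 'sum            =', total
  print *, 'CPU time (s)   =', cpu1 - cpu0
  print *, 'wall clock (s) =', real(tick1 - tick0, dp) / real(rate, dp)
end program timing_demo
```

With several threads I would expect the CPU time to stay roughly the
same (or grow a little) while the wall-clock time drops, which is
exactly what fooled me.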

> (1) Fortran arrays are stored with the first subscript increasing most
> rapidly, the opposite of that used for C and C++.  Reversing the loop
> order will make better use of cache.

This made a huge difference whether using OpenMP or not, thanks!
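
In case it helps someone else, a minimal sketch of the two loop orders.
The array name c matches the expression quoted further down, but the
fill values and the assignment of 1000 and 5000 to nx and ny are my own
assumptions for illustration, not the real subroutine.

```fortran
program loop_order
  ! Sketch: the same sum of squares computed with two loop orders.
  ! Fortran stores c(i, j) with i varying fastest in memory.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nx = 1000, ny = 5000   ! assumed sizes
  real(dp), allocatable :: c(:, :)
  real(dp) :: s_fast, s_slow
  integer :: i, j, t0, t1, t2, rate

  allocate(c(nx, ny))
  do j = 1, ny
     do i = 1, nx
        c(i, j) = real(mod(i + j, 7), dp)      ! arbitrary test data
     end do
  end do

  call system_clock(t0, rate)

  ! First subscript in the inner loop: consecutive memory locations.
  s_fast = 0.0_dp
  do j = 1, ny
     do i = 1, nx
        s_fast = s_fast + c(i, j) * c(i, j)
     end do
  end do
  call system_clock(t1)

  ! Second subscript in the inner loop: a stride of nx elements per step.
  s_slow = 0.0_dp
  do i = 1, nx
     do j = 1, ny
        s_slow = s_slow + c(i, j) * c(i, j)
     end do
  end do
  call system_clock(t2)

  print *, 'sums           =', s_fast, s_slow
  print *, 'i innermost (s) =', real(t1 - t0, dp) / real(rate, dp)
  print *, 'j innermost (s) =', real(t2 - t1, dp) / real(rate, dp)
end program loop_order
```

Both loops give the same sum; only the memory access pattern differs.
At higher optimization levels the compiler may reorder the loops
itself, so the gap can shrink or vanish depending on the flags used.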

> (2) The second problem is the dimensions ("I've set nx and ny so large
> (1000 and 5000...").  To avoid cache conflicts, you want to choose the
> number of rows to be something other than a power of 2: a prime number
> is often a good choice.  I have an example in my files of a program
> that ran about 3 times faster just by changing a row dimension from
> 256 (where there were cache collisions along the row) to 257 (where
> cache collisions are rare).

I would NEVER have figured this out, thanks. In the current
application the problem dictates the sizes of my arrays, so I can't
really use the tip, but I'll keep it in mind in the future.

> You should also check the generated assembly code (f77 -S foo.f)
> whether C(i,j)**2 is compiled into the inline code C(i,j)*C(i,j), or
> into call to the run-time library power function, and also whether the
> subscript address computations are eliminated.

I'll just inline it manually to be sure. I was trying to get a speedup
from OpenMP in that subroutine, not necessarily to optimize it overall.

Anand
