At 05:39 PM 12/14/2006, Alexander Staubo wrote:
On Dec 14, 2006, at 20:28 , Ron wrote:
Can you do runs with just CFLAGS="-O3" and just CFLAGS="-msse2
-mfpmath=sse -funroll-loops -m64 -march=opteron -pipe" as well?
All right. From my perspective, the effect of -O3 is significant,
whereas architecture-related optimizations have no statistically
significant effect.
Is this opinion? Or have you rerun the tests using the flags I
suggested? If so, can you post the results?
If "-O3 -msse2 -mfpmath=sse -funroll-loops -m64 -march=opteron
-pipe" results in a 30-40% speedup over "-O0", and
"-msse2 -mfpmath=sse -funroll-loops -m64 -march=opteron -pipe"
results in a 5-10% speedup, then ~1/8 - 1/3 of the total possible
speedup is due to arch specific optimizations.
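
For what it's worth, the arithmetic behind that range can be checked with a short sketch. The function name arch_share is made up for illustration, and the percentages are the hypothetical ones from the paragraph above, not measured results.

```python
def arch_share(arch_speedup_pct, total_speedup_pct):
    """Fraction of the total speedup attributable to the arch flags."""
    return arch_speedup_pct / total_speedup_pct


# Hypothetical ranges from the text: 5-10% from arch flags, 30-40% total.
low = arch_share(5, 40)    # smallest arch gain over largest total gain
high = arch_share(10, 30)  # largest arch gain over smallest total gain

print(f"arch-specific share of total speedup: {low:.3f} to {high:.3f}")
```

The low end pairs the smallest arch gain with the largest total gain, and the high end does the reverse, which is where 1/8 and 1/3 come from.
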
(Testing "-O3" in isolation additionally tests for independence of
factors, as well as showing what "plain" "-O3" can accomplish.)
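
One way to read "independence of factors" is that the gains from the two flag groups should combine roughly multiplicatively when both are enabled. Below is a minimal sketch of that check, assuming run times for the four builds (baseline, "-O3" only, arch flags only, both). The function name, the 5% tolerance, and the sample timings are all placeholders invented for illustration, not data from any run.

```python
def factors_look_independent(t_base, t_o3, t_arch, t_both, tolerance=0.05):
    """Compare the measured combined speedup with the product of the
    individual speedups. All arguments are run times in the same unit."""
    speedup_o3 = t_base / t_o3
    speedup_arch = t_base / t_arch
    speedup_both = t_base / t_both
    # Under independence, the combined speedup is the product of the two.
    predicted = speedup_o3 * speedup_arch
    relative_gap = abs(speedup_both - predicted) / predicted
    return relative_gap <= tolerance, predicted, speedup_both


# Placeholder timings in seconds, invented purely to exercise the function.
ok, predicted, measured = factors_look_independent(100.0, 75.0, 95.0, 71.0)
print(ok, round(predicted, 3), round(measured, 3))
```

If the measured combined speedup lands well away from the predicted product, the two flag groups interact and their contributions cannot simply be split apart.
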
Some might argue that a 5-10% speedup which represents 1/8 - 1/3 of
the total speedup is significant...
But enough speculating. I look forward to seeing your data.
Ron Peacetree