At 04:54 AM 12/15/2006, Alexander Staubo wrote:
>On Dec 15, 2006, at 04:09 , Ron wrote:
>>At 07:27 PM 12/14/2006, Alexander Staubo wrote:
>>>Sorry, I neglected to include the pertinent graph:
>>>http://purefiction.net/paste/pgbench2.pdf
>>In fact, your graph suggests that using arch specific options in
>>addition to -O3 actually =hurts= performance.
>According to the tech staff, this is a Sun X4100 with a two-drive
>RAID 1 volume. No idea about the make of the hard drives.
>Alexander.
http://www.sun.com/servers/entry/x4100/features.xml
So we are dealing with a 1U, 1-4 socket (which means 1-8 core) AMD Kx
box with up to 32GB of ECC RAM (DDR2?) and 2 Seagate 2.5" SAS HDs.
http://www.seagate.com/cda/products/discsales/index/1,,,00.html?interface=SAS
My bet is the X4100 contains one of the 3 models of Cheetah
15K.4s. A simple du, dkinfo, whatever, will tell you which.
I'm looking more closely into exactly what the various gcc -O
optimizations do on Kx CPUs as well.
64-bit vs 32-bit gets x86-compatible code access to ~2x as many
registers, and MMX or SSE instructions get you access to not only
more registers, but wider ones as well.
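
To put rough numbers on the register argument, here is a small sketch that tabulates the architectural register files involved. The counts and widths are the standard x86 / x86-64 figures rather than anything measured on the X4100, and the names REGISTER_FILES, capacity_bits and count_ratio are made up for this illustration. It says nothing about how well gcc actually uses those registers.

```python
# Architectural register files visible to compiled code. Counts and widths
# are the standard x86 / x86-64 figures, not measurements from any machine.
REGISTER_FILES = {
    # name: (number of registers, width in bits)
    "x86-32 general purpose": (8, 32),
    "x86-64 general purpose": (16, 64),
    "MMX (aliases the x87 registers)": (8, 64),
    "SSE in 32-bit mode (XMM)": (8, 128),
    "SSE in 64-bit mode (XMM)": (16, 128),
}


def capacity_bits(name):
    """Total bits held by the named register file (count * width)."""
    count, width = REGISTER_FILES[name]
    return count * width


def count_ratio(larger, smaller):
    """How many times more registers `larger` has than `smaller`."""
    return REGISTER_FILES[larger][0] / REGISTER_FILES[smaller][0]


if __name__ == "__main__":
    # Print one row per register file: count, width, and total capacity.
    for name, (count, width) in REGISTER_FILES.items():
        total = capacity_bits(name)
        print(f"{name:32s} {count:2d} x {width:3d} bits = {total:4d} bits")
```

Note that one of the eight 32-bit general purpose registers is the stack pointer, which is part of why the gain is "~2x" rather than an exact doubling of what the compiler can allocate.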
As one wit has noted, "all optimization is an exercise in caching."
(Terje Mathisen, one of the better assembler coders on the planet.)
It seems unusual that code generation options which give access to
more registers would ever result in slower code...
Ron Peacetree