Re: New to PostgreSQL, performance considerations

"Merlin Moncure" <mmoncure@xxxxxxxxx> · Fri, 15 Dec 2006 09:23:43 -0500

On 12/15/06, Ron <rjpeace@xxxxxxxxxxxxx> wrote:
I'm looking more closely into exactly what the various gcc -O
optimizations do on Kx's as well.
64b vs 32b gets x86 compatible code access to ~ 2x as many registers;
and MMX or SSE instructions get you access to not only more
registers, but wider ones as well.

As one wit has noted, "all optimization is an exercise in caching."
(Terje Mathisen- one of the better assembler coders on the planet.)

It seems unusual that code generation options which give access to
more registers would ever result in slower code...

The slowdown is probably due to the unroll loops switch, which can
actually hurt performance due to the larger code footprint (worse
instruction cache locality).
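
To make the footprint point concrete, here is a rough sketch of what
unrolling does, written by hand in C.  I'm assuming the switch in
question is gcc's -funroll-loops; the function names are made up for
illustration, and the code the compiler actually emits will differ.

```c
#include <stddef.h>

/* Rolled loop: small body, one compare-and-branch per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hand-unrolled by 4: one compare-and-branch per four elements, but
 * the loop body is about four times larger and a cleanup loop is
 * needed for the leftover 0..3 elements.  That extra code is what
 * competes for instruction cache space. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++)      /* remainder */
        total += v[i];
    return total;
}
```

Both functions return the same sum.  Whether the unrolled one is
faster depends on whether the saved branches outweigh the bigger
footprint, and applied across a whole program that is not a given.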

The extra registers are not all that important because of pipelining
and other hardware tricks.  Pretty much all the old assembly
strategies such as forcing local variables to registers are basically
obsolete...especially with regard to integer math.  As I said before,
modern CPUs are essentially RISC engines with a CISC preprocessing
engine laid on top.

Things are much more complicated than they were in the old days where
you could count instructions for the assembly optimization process.  I
suspect that there is little or no difference between -march=i686
and the various specific architectures.  Did anybody think to look at
the binaries and see how much they actually differ?  I bet code
compiled for -march=opteron will run just fine on a Pentium 2 if
compiled for 32 bit.
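
One crude way to check how much the binaries differ is to count the
bytes that do not match between two builds of the same object file.
The sketch below is only an illustration: the function names are
hypothetical, and a raw byte count overstates the difference whenever
one build inserts an instruction and shifts everything after it.

```c
#include <stdio.h>
#include <stddef.h>

/* Count positions where two buffers differ.  Bytes past the end of
 * the shorter buffer all count as differences. */
size_t count_differing_bytes(const unsigned char *a, size_t na,
                             const unsigned char *b, size_t nb)
{
    size_t common = na < nb ? na : nb;
    size_t diff = (na > nb ? na : nb) - common;

    for (size_t i = 0; i < common; i++)
        if (a[i] != b[i])
            diff++;
    return diff;
}

/* Same comparison, streamed from two files.  Returns -1 if either
 * file cannot be opened. */
long count_file_diff(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    long diff = -1;

    if (fa && fb) {
        int ca, cb;
        diff = 0;
        do {
            ca = fgetc(fa);
            cb = fgetc(fb);
            if (ca != cb)
                diff++;
        } while (ca != EOF || cb != EOF);
    }
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return diff;
}
```

For example, build the same source twice with something like
"gcc -m32 -O2 -march=i686 -c foo.c -o foo_i686.o" and
"gcc -m32 -O2 -march=opteron -c foo.c -o foo_opteron.o" (foo.c being
a placeholder), then pass the two object files to count_file_diff().
Diffing the output of "objdump -d" on each file shows the actual
instruction changes and is more informative than the raw count.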

merlin