It would probably help you to spend some time browsing the archives of
this list for questions similar to yours - you'll find quite a lot of
consistent answers. In general, you'll find that:
- If you can fit your entire database into memory, you'll get the best
performance.
- If you cannot (and most databases cannot) then you'll want to get the
fastest disk system you can.
- For reads, RAID5 isn't so bad but for writes it's near the bottom of the
options. RAID10 is not as efficient in terms of hardware, but if you
want performance for both reads and writes, you want RAID10.
- Your RAID card also matters. Areca cards are expensive, and a lot of
people consider them to be worth it.
- More processors tend to be better than faster processors, because more of
  them let you do more at once, and databases tend to be I/O-bound more
  than CPU-bound.
- That said, more or faster processors put more contention on the data, so
  upgrading CPUs mostly increases the need for faster disks or more RAM.
- PG is 64 bit if you compile it to be so, or if you install a 64-bit
binary package.
....and all that said, application and schema design can play a far more
important role in performance than hardware.
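To put a rough number on the RAID point above, here is the classic small-write-penalty arithmetic as a back-of-envelope sketch. The disk count and per-disk IOPS figure are assumptions for illustration, not measurements:

```python
# Back-of-envelope random-write throughput for RAID5 vs RAID10.
# Assumption: 8 disks, each doing ~150 random IOPS (roughly a 10k RPM
# drive of that era); the figures are illustrative only.
disks = 8
iops_per_disk = 150

# RAID5 small-write penalty: read old data + read old parity +
# write new data + write new parity = 4 back-end I/Os per logical write.
raid5_write_iops = disks * iops_per_disk / 4

# RAID10: each logical write just hits both halves of a mirror = 2 I/Os.
raid10_write_iops = disks * iops_per_disk / 2

print(raid5_write_iops)   # 300.0
print(raid10_write_iops)  # 600.0
```

Same spindles, twice the random-write throughput — which is why RAID10 keeps coming up on this list for write-heavy databases.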
On Wed, 31 Oct 2007, Ketema Harris wrote:
I am trying to build a very robust DB server that will support 1000+
concurrent users (we've already seen a max of 237, with no pooling in use).
I have read so many articles now that I am just saturated. I have a general
idea but would like feedback from others.
I understand query tuning and table design play a large role in
performance, but setting that factor aside and focusing on just hardware:
what is the best hardware to get for Pg to work at the highest level
(meaning speed at returning results)?
How does pg utilize multiple processors? The more the better?
Are queries spread across multiple processors?
Is Pg 64 bit?
If so what processors are recommended?
I read this :
http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/node12.html
POSTGRESQL uses a multi-process model, meaning each database connection has
its own Unix process. Because of this, all multi-cpu operating systems can
spread multiple database connections among the available CPUs. However, if
only a single database connection is active, it can only use one CPU.
POSTGRESQL does not use multi-threading to allow a single process to use
multiple CPUs.
It's pretty old (2003), but is it still accurate? If so, how would it
affect connection pooling software like pg_pool?
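On the pooling question: since each server connection is a whole Unix process, the point of any pooler is to multiplex many client sessions over a small, fixed set of backend connections. pg_pool's internals aside, the core idea is just a blocking queue of connections — here is a minimal sketch, with a stand-in `make_conn` function in place of a real libpq/driver connect call:

```python
import queue

class ConnectionPool:
    """Hand out a fixed set of connections; callers block when all are busy."""

    def __init__(self, connect, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(connect())   # open all backends up front

    def acquire(self):
        return self._pool.get()         # blocks until a connection frees up

    def release(self, conn):
        self._pool.put(conn)            # hand the backend to the next caller

# Stand-in for a real connect() from a PostgreSQL driver; hypothetical here.
make_conn = lambda: object()

pool = ConnectionPool(make_conn, size=5)
conn = pool.acquire()
# ... run queries on conn ...
pool.release(conn)
```

The practical upshot for the multi-process model: 1000 clients through a 50-connection pool means at most 50 backend processes, so the per-connection process cost stops being the limiting factor.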
RAM? The more the merrier, right? I understand that shmmax and the
shared-memory parameters in the pg config file have to be adjusted to use it.
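Right — the kernel's shmmax must be at least as large as the shared memory segment PostgreSQL asks for, or the postmaster won't start. A sketch of the two settings side by side (all values are illustrative, not recommendations):

```
# /etc/sysctl.conf -- kernel must allow a segment at least as large as
# PostgreSQL's shared memory request (values illustrative):
kernel.shmmax = 1073741824    # 1 GB, in bytes
kernel.shmall = 262144        # total shared memory, in 4 kB pages here

# postgresql.conf -- the request must fit inside shmmax:
shared_buffers = 512MB        # memory-unit syntax needs 8.2 or later;
                              # older versions take this in 8 kB pages
```

shared_buffers is the big consumer, but work_mem, wal_buffers, and max_connections all add to the segment, so leave headroom under shmmax.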
Disks? Standard RAID rules apply, right? RAID1 for safety, RAID5 for the
best mix of performance and safety?
Any preference of SCSI over SATA? What about using a high-speed (Fibre
Channel) mass storage device?
Who has built the biggest baddest Pg server out there and what do you use?
Thanks!
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq