Ketema wrote:
> I am trying to build a very robust DB server that will support 1000+
> concurrent users (already have seen a max of 237, no pooling being
> used). I have read so many articles now that I am just saturated. I
> have a general idea but would like feedback from others.

Describe it a bit better. 1,000 users or 1,000 simultaneous
connections? I.e., do you have a front end where someone logs on, gets
a connection, and keeps it for the duration, or is it a web-type app
where each request might connect-query-disconnect? If the latter, are
your connections persistent? How many queries per second do you
expect? How complex are the queries (retrieve a single record or
data-mining)? Read-only, or lots of updates? Do the read queries need
to be run every time, or are they candidates for caching?

> RAM? The more the merrier right?

Generally true. But once you reach the point that everything can fit
in RAM, more is just wasted $$$. And, according to my reading, there
are cases where more RAM can hurt: basically, if your queries are
large enough to flush the cache every time, you get no benefit from
caching but still pay the cost of checking the cache for the data.
(See the P.S. below for a sketch of the memory-related settings.)

> Who has built the biggest baddest Pg server out there and what do you
> use?

Not me. Someone just showed me live system-monitoring data on one of
his several PG machines. That one was clocking multi-thousand TPS on a
server (Sun??) with 128GB RAM. That much RAM makes "top" look amusing.

Several of the social-networking sites are using PG, generally
spreading the load over several (up to dozens of) servers. They also
make heavy use of pooling and caching - think dedicated memcached
servers offering a combined pool of several TB of RAM.

For pooling, pgbouncer seems to have a good reputation. Tests on my
current production server show it shaving a few ms off every
connect-query-disconnect cycle. Connects are fairly fast in PG, but
that delay becomes a significant issue under heavy load. (A minimal
pgbouncer config sketch is also in the P.S.)

Test pooling carefully, though. If you blindly run everything through
your pooler instead of just selected apps, you can end up with
unexpected problems when one client changes a backend setting like
"set statement_timeout to 5". If the next client assigned to that
backend connection runs a long-duration analysis query, it is likely
to fail. (The P.S. shows this failure mode, and a workaround, in SQL.)

Cheers,
Steve
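
P.S. A few illustrative sketches follow. Nothing below comes from
Ketema's setup; every name and number is a hypothetical starting
point, not a recommendation.

First, the memory-related postgresql.conf settings, sized here for an
imagined dedicated 16GB box:

    # postgresql.conf (unit suffixes like "2GB" need 8.2 or later;
    # older versions take raw buffer/kB counts)
    shared_buffers = 2GB           # PG's own buffer cache
    effective_cache_size = 12GB    # planner hint: shared_buffers + OS cache
    work_mem = 16MB                # per sort/hash, per backend - multiply by concurrency
    max_connections = 200          # keep modest; let a pooler fan clients in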
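
Second, a minimal pgbouncer.ini for the pooling setup. Host, sizes,
and pool mode are assumptions; transaction pooling shares backends
most aggressively but deliberately breaks session-level state:

    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction
    max_client_conn = 1000            ; what the clients see
    default_pool_size = 50            ; real backends per database/user pair
    server_reset_query = DISCARD ALL  ; DISCARD ALL needs PG 8.3+; use "RESET ALL" on older servers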
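
Finally, the statement_timeout gotcha in SQL. The table name is made
up; the point is that a plain SET sticks to the shared backend, not to
the client that issued it:

    -- Client A, connected through the pooler
    -- (statement_timeout is in milliseconds, so this is 5 ms):
    SET statement_timeout TO 5;
    -- A disconnects; the pooler hands the same backend to client B:
    SELECT count(*) FROM big_table;
    -- ERROR:  canceling statement due to statement timeout

    -- Safer: scope the setting to one transaction so it cannot leak:
    BEGIN;
    SET LOCAL statement_timeout TO 5000;
    SELECT count(*) FROM big_table;
    COMMIT;  -- the timeout reverts here

A server_reset_query like the one in the pgbouncer sketch above also
wipes leftover session settings between client assignments.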