Re: Redundant file server for postgres

Karl Denninger <karl@xxxxxxxxxxxxx> · Sun, 16 Mar 2008 14:02:10 -0500

Craig Ringer wrote:
Robert Powell wrote:
To whom it may concern,

I'm looking for a file server that will give me a high level of
redundancy and high performance for a postgres database.

For strong redundancy and availability you may need a secondary server 
and some sort of replication setup (be it a WAL-following warm spare, 
Slony-I, or whatever). It depends on what you mean by "high".

As for performance - I'm still learning on this myself, so treat the 
following as being of questionable accuracy.

As far as I know the general rule for databases is "if in doubt, add 
more fast disks". A fast CPU (or depending on type of workload several 
almost-as-fast CPUs) will be nice, but if your database is big enough 
not to fit mostly in RAM you'll mostly be limited by disk I/O. To 
increase disk I/O performance, in general you want more disks. Faster 
disks will help, but probably not as much as just having more of them.

More RAM is of course also nice, though it might make a huge difference for 
some workloads and database types and relatively little for others. If 
doubling your RAM lets the server cache most of the database in RAM 
it'll probably speed things up a lot. If doubling the RAM is the 
difference between 2% and 4% of the DB in RAM ... it might not make 
such a difference (unless, of course, your queries mostly operate on a 
subset of your data that's fairly similar to your RAM size, you do 
lots of big joins, etc).
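
To put rough numbers on the 2% versus 4% case, here is a back-of-the-envelope
sketch. It assumes uniformly random access, so the cache hit rate equals the
fraction of the database held in RAM, which real workloads rarely match. The
function name and both latency figures are illustrative assumptions, not
measurements.

```python
def mean_access_time_ms(cached_fraction, ram_ms=0.0001, disk_ms=8.0):
    """Expected time per page access under uniformly random access.

    cached_fraction: share of the database held in RAM (0.0 to 1.0).
    ram_ms, disk_ms: assumed latencies, illustrative only.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    # Hits are served from RAM, misses go to disk.
    return cached_fraction * ram_ms + (1.0 - cached_fraction) * disk_ms


for fraction in (0.02, 0.04, 0.50, 0.95):
    ms = mean_access_time_ms(fraction)
    print(f"{fraction:.0%} cached: {ms:.3f} ms per access")
```

Under these assumptions, going from 2% to 4% cached lowers the mean access
time by only about 2%, while caching most of the database lowers it many
times over. A skewed access pattern (a small hot subset) would shift the
picture in favour of the smaller cache.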

Various RAID types also have implications for disk I/O. For example, 
RAID-5 tends to have miserable write performance.
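
The RAID-5 point can be made concrete with the usual textbook "write
penalty" figures: a small random write costs four disk I/Os on RAID-5 (read
old data, read old parity, write new data, write new parity) against two on
RAID-10. The sketch below is a rough estimate only; the per-disk IOPS value
is an assumed number, and controller write caches or full-stripe writes can
change the result considerably.

```python
# Back-end disk I/Os needed per small random write (textbook figures).
WRITE_PENALTY = {
    "raid0": 1,   # no redundancy
    "raid10": 2,  # write both mirror copies
    "raid5": 4,   # read data + parity, then write data + parity
}


def random_write_iops(level, disks, iops_per_disk):
    """Rough upper bound on small random writes per second for an array."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Assumed: 8 disks at 150 random IOPS each (hypothetical figures).
for level in ("raid10", "raid5"):
    print(level, random_write_iops(level, disks=8, iops_per_disk=150))
```

With these assumed figures the same eight disks sustain about half as many
small random writes in RAID-5 as in RAID-10.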

In the end, though, it depends a huge amount on your workload. Will 
you have huge numbers of simpler concurrent transactions, or 
relatively few heavy and complex ones? Will the database be 
read-mostly, or will it be written to very heavily? Vaguely how large 
is your expected dataset? Is all the data likely to be accessed with 
equal frequency or are most queries likely to concentrate on a small 
subset of the data? And so on...

--
Craig Ringer

The key issue with RAM is not whether the whole database will fit into RAM 
(for all but the most trivial applications, it will not).  It is whether 
the key INDICES will fit into RAM.  If they will, then you get a HUGE win 
in performance.
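
A minimal sketch of that check, assuming you already know the on-disk size
of each important index (in PostgreSQL, pg_relation_size() can report it).
The index names, the sizes, and the 50% "usable for caching" fraction are
all hypothetical; the right fraction depends on what else the machine runs.

```python
GIB = 1024 ** 3


def indexes_fit_in_ram(index_sizes_bytes, ram_bytes, usable_fraction=0.5):
    """Return (fits, total_index_bytes, budget_bytes).

    usable_fraction is an assumed share of RAM available for caching
    index pages; it is a guess, not a PostgreSQL setting.
    """
    total = sum(index_sizes_bytes.values())
    budget = int(ram_bytes * usable_fraction)
    return total <= budget, total, budget


# Hypothetical index sizes, for illustration only.
sizes = {"orders_pkey": 3 * GIB, "orders_customer_idx": 2 * GIB}

for ram_gib in (8, 16):
    fits, total, budget = indexes_fit_in_ram(sizes, ram_gib * GIB)
    print(f"{ram_gib} GiB RAM: indexes {total / GIB:.1f} GiB, "
          f"budget {budget / GIB:.1f} GiB, fit={fits}")
```

If the key indexes do not fit within the budget, more RAM is likely to pay
off; if they already fit with room to spare, the money is probably better
spent on disks.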

If not, then it is all about disk I/O performance: the better you can 
spread that load across multiple spindles and get the data to the CPU 
at a high rate, the faster the system will perform.

In terms of redundancy, you have to know your workload before designing a 
strategy.  For a database that is almost all queries (few 
inserts/updates), the job is considerably simpler than for a database that 
sees very frequent inserts and/or updates.

Karl Denninger (karl@xxxxxxxxxxxxx)
http://www.denninger.net

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)