On 6/1/07, Madison Kelly <linux@xxxxxxxxxxx> wrote:
> After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multiple machines, I am back at square one, feeling somewhat the fool. :P
I remember being similarly disappointed in this rampant co-opting of the word "cluster" back in 7.4 or so. :) A gaggle of geese, a murder of crows, a cluster of databases, I guess.
> Can anyone point me to docs/websites that discuss options on replicating in (as close as possible to) realtime? Ideally with load balancing while both/all servers are up, and failover/resyncing when a member fails and is restored.
The PostgreSQL documentation gives a pretty good overview of the options:

http://www.postgresql.org/docs/8.2/interactive/high-availability.html

That said, there is to my knowledge no single, integrated product that will do all you ask. None are capable of anything near real-time, automatic failover tends to be left as an exercise for the reader, it takes a lot of work to get a setup up and running, and it requires particular care in maintenance and monitoring once it's up.

There are several commercial products (Mammoth Replicator comes to mind) and several open-source projects. Among the open-source ones (Slony-I, pgpool, PGCluster), I believe Slony-I is the most mature. There are a few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that are not ready for prime time yet; of these, I believe pgpool-II is the most promising.

As mentioned in a different thread today, work is being done to implement WAL-based master-slave replication, which I think should prove more scalable and more transparent than the current third-party products:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php
> I've looked at slony, but it looks more like a way to push occasional copies to slaves, and isn't meant to be real time. Am I wrong by chance?
Slony is indeed intended for near-real-time replication; it's asynchronous, so slaves always lag behind the master. The amount of discrepancy depends on a bunch of factors -- individual node performance, network performance, and system load.

Alexander.