"Phoenix Kiula" <phoenix.kiula@xxxxxxxxx> wrote:
>
> We're moving from MySQL to PG, a move I am rather enjoying, but we're
> currently running both databases. As we web-enable our financial
> services in fifteen countries, I would like to recommend the team that
> we move entirely to PG.
>
> In doing research on big installations of the two databases, I read
> this from a MySQL senior exec on Slashdot:

Senior MySQL exec means this is a marketing blurb, which means it's
exaggerated, lacking any honest assessment of challenges and
difficulties, and possibly an outright lie. I've no doubt that MySQL can
do clusters if you know what you're doing, but if you want the facts,
you're going to have to look deeper than that obviously biased quote. I
seem to remember a forum thread with someone having considerable
difficulty with MySQL cluster, and actual MySQL employees jumping in to
try to help and no solution ever found. Anyone have that link lying
around?

In any event, replication is a large and complex topic. To do it well
takes research, planning, and know-how. Anyone who tells you their
solution will just drop in and work is either lying or charging you a
bunch of money for their consultants to investigate your scenario and
set it up for you.

First off, "clustering" is a word that is too vague to be useful, so
I'll stop using it. There's multi-master replication, where every
database is read-write, then there's master-slave replication, where
only one server is read-write and the rest are read-only. You can add
failover capabilities to master-slave replication. Then there's
synchronous replication, where all servers are guaranteed to get updates
at the same time. And asynchronous replication, where other servers may
take a while to get updates. These descriptions aren't really specific
to PostgreSQL -- every database replication system has to make design
decisions about which approaches to support.

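To make the synchronous/asynchronous distinction concrete, here is a
minimal sketch in Python. It is a toy in-memory model, not how any real
replication system works; the Master and Replica classes and their
method names are made up for illustration.

```python
from collections import deque


class Replica:
    """A read-only copy that only changes when the master ships a write."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Master:
    """The single read-write server in a master-slave setup (toy model)."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped (async mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: the write does not return until every
            # replica has applied it, so replicas are never behind.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: return immediately and ship the change later.
            self.pending.append((key, value))

    def ship_pending(self):
        """Deliver queued writes to the replicas, in order (async mode)."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(key, value)
```

In synchronous mode every write pays the cost of reaching all replicas
before it returns. In asynchronous mode the write returns right away,
but a replica can serve stale data until ship_pending() has run.
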
PostgreSQL has some built-in features to allow synchronous multi-master
database replication. Two-phase commit allows you to reliably commit
transactions to multiple servers concurrently, but it requires support
at the application level, which will require you to rewrite any existing
applications.

Pgcluster is multi-master synchronous replication, but I believe it's
still in beta.

Note that no synchronous replication system works well over
geographically large distances. The time required for the masters to
synchronize over (for example) the Internet kills performance to the
point of uselessness. Again, this is not a PostgreSQL problem; MSSQL
suffers from the same thing. Synchronous replication is only really used
when two servers are right next to each other with a high-speed link
(probably gigabit) between them.

PostgreSQL-R is in development, and targeted to allow multi-master,
asynchronous replication without rewriting your application. As far as
I know, it works, but it's still beta.

pgpool supports multi-master synchronous replication as well as
failover.

Slony supports master-slave asynchronous replication and works _very_
well over long distances (such as from an east coast to a west coast
datacenter).

Once you've looked at your requirements, start looking at the tool that
matches those requirements, and I think you'll find what you need.

BTW: does anyone know of a link that describes these high-level
concepts? If not, I think I'll write this up formally and post it.

-- 
Bill Moran
http://www.potentialtech.com