Re: Replication Using Triggers

"Merlin Moncure" <mmoncure@xxxxxxxxx> · Sun, 20 Jan 2008 22:18:19 -0500

On Jan 19, 2008 6:46 PM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> Existing solutions can't handle multiple masters. MySQL can do it at
> least in a ring arrangement.

mysql multi-master replication looks a lot better on paper than it
really is...one of the reasons mysql is able to seemingly provide it
so easily is that mysql doesn't take transactions and locks very
seriously whereas postgres does.  Because spreading the locking system
around on multiple servers complicates everything and adds all kinds
of timing issues, you are unlikely to see any real performance benefit
over a single well built server unless your ratio of reads to writes
is extremely high (especially on a relatively high latency link like
ethernet).  Since single master replication will fly in many of those
cases, the use case for multi-master replication is a lot smaller than
it appears on the surface.  multi-master is more appealing if you do
not need locks strictly enforced and you can play fast and loose with
your data...many applications do not have this luxury however.  Many
people who ask for this feature grossly misunderstand the complexity
of the problem and expect there to be a magical solution with few
trade-offs.  mysql's solution to this problem, while quite useful at
solving particular problems, is not a 'solution' in the general sense.
 By the way, many big and/or busy database are bound by the i/o
system, not the cpu.

PostgreSQL chose another path...we like our locks and want our
transactions to give exact and specific answers.  This means a few
compromises on some things you take for granted in mysql (select
count(*) from foo; comes to mind), but maybe a better solution for
solving a wide range of applications where every transaction counts.
PostgreSQL also scales very well too multiple core systems (much
better than mysql, its been reported), so you can always scale the
server up fairly cheaply...x86 can easily scale up to 16 cores these
days fairly cheaply with decent results.  Expect these trends to
continue going forwards...one thing that the PostgreSQL community
might be interested in the future is distributing a single query
across multiple cores...

Anyways, for replication on the single master side you have Slony
(trigger based) which is an effective solution along with some other
supported solutions (mammoth replicator, for example).  At some point
in the future (maybe 8.4?) we will have PITR hot standby which will
likely be the 'in core' replication solution although various 3rd
party tools may extend on it and make it do various things.
Personally, unless I have some easily distributed task or have
insanely high transaction loading requirements (popular web site), I
would build twin servers, each capable of servicing the database
completely and use PITR to maintain a warm standby.  Obviously, this
approach has limits but it's often a good place to start.  If things
are too hot for a single server to handle, maybe it's time to start
dividing up the tasks into logical portions with shared data being
moved around (with db-link, or slony).

merlin

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings