On Fri, 18 Jan 2008, Erik Jones wrote:
Is there any reason why PostgreSQL replication solutions are all add-on 3rd
party ones?
Because no one solution would be appropriate for everyone. The core team and
contributors feel that their time is better spent on the database itself
rather than developing and maintaining multiple different replication
solutions and dealing with the support thereof. What has been done is to add
some extra hooks in 8.3 for replication triggers that can help to specialize
when/if a given trigger fires.
Hmm, selective trigger firing sounds interesting.
Is there any reason why replication couldn't be implemented using triggers
and a handful of stored procedures?
That's usually how it's done. Well, plus some external user-land application
libraries.
That's just it - I don't think any user-land libraries would actually be
required. One of supposed big advantages of MySQL is it's straightforward
replication support. It's quite painful to see PostgreSQL suffer purely
for the sake of lack of marketting in this department. :-(
This is what I have in mind:
Have a plperl function that creates connections to all servers in the
cluster (replication partners), and issues the supplied write query to
them, possibly with a tag of some sort to indicated it is a replicated
query (to prevent circular replication).
Have a post execute trigger that calls the above replication function if
the query was issued directly (as opposed to replicated), and passes it the
query it just executed if it was successful.
If the replication failed on any node, the whole thing gets rolled back.
That sounds pretty brittle. Do you really want all progress in your
databases to stop if there is a network issue to a single server?
1) That's what MySQL does (it either ignores errors or stops replication
on encountering an error, which of those two it does is selectable, but
that's about it).
2) If there is a network/server issue, that would be detected because the
$dbh would break. If the $dbh breaks, then plperl can either attempt to
re-connect, or failing that, boot the node from the replication cluster
(could have a node list with active/fenced flag in a separate config
schema). Neither approach would be difficult to implement.
Then it could just not bother reconnecting the fenced node until the user
updates the node status in the said config schema table. Configuration
schema could also be replicated.
This would effectively give star topology synchronous replication with very
little effort, and no need for any external code. Am I missing something
obvious that would prevent this from working? It would give replication
capabilities better than MySQL's (which can only handle ring based
multi-master replication) for the sake of about 100 lines of code. None of
the required functionality required is new to PostgreSQL, either.
Is there an existing implementation of this? Perhaps a perl program that
creates the required triggers and stored procedures from looking at a
schema?
What you've described here would be pretty simple to implement. However, I
think you've greatly underestimated the performance issues involved. If you
need to push data to multiple databases before each transaction commits I
think you'll find that pretty slow.
Only if transactions are used. I'm basing the requirements on "at least as
good as MySQL", which this would meet without transactions. If
transactions are wanted they could be enabled, otherwise it could just be
fire and forget asynchronous replication a-la MySQL. Having a choice
between transactions and speed is good. :-)
One thing I haven't quite thought of a good way to do with this approach
is the equivalent of the useful (albeit deprecated) LOAD DATA FROM MASTER
command, that gets the server in sync by dropping and re-loading all the
tables from the master(s) (or rather, peers in a multi-master star
replication), and enables it in the replication. It would be neater than
requiring downtime or global write locks. But I guess that could wait
until version 2. :)
That's why most of the available third
party solutions are asynchronous. The biggest options are out there are
Slony and Londiste (both master-slave, asynchronous) and Bucardo
(asynchronous, but supports both master-master and master-slave) which, as
you would have it, is written in Perl.
I looked at all of the above, and they all seemed (to meat least) to
involve unnecessary complication or limitations I saw as unreasonable (or
both). I looked at Bucardo in detail, and I was rather disappointed to see
that it only supports two master nodes at the moment.
Thanks.
Gordan
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings