
Re: Replication Using Triggers

On Fri, 18 Jan 2008, Erik Jones wrote:

Is there any reason why PostgreSQL replication solutions are all add-on 3rd party ones?

Because no one solution would be appropriate for everyone. The core team and contributors feel that their time is better spent on the database itself rather than on developing and maintaining multiple different replication solutions and dealing with the support thereof. What has been done is to add some extra hooks in 8.3 for replication triggers, which make it possible to control when/if a given trigger fires.

Hmm, selective trigger firing sounds interesting.
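(For reference, the 8.3 hook in question is the session_replication_role setting together with per-trigger firing modes. A minimal sketch, with an assumed table name "accounts" and assumed trigger names:

```sql
-- A session applying replicated changes announces itself as 'replica'.
-- Triggers in the default ORIGIN state then do NOT fire, which is what
-- prevents a replication trigger from re-replicating incoming changes.
SET session_replication_role = 'replica';

-- A trigger left in the default state fires only for locally
-- originated writes (i.e. when the session role is 'origin'):
ALTER TABLE accounts ENABLE TRIGGER replicate_accounts;

-- A trigger that should fire only while applying replicated changes:
ALTER TABLE accounts ENABLE REPLICA TRIGGER apply_side_fixup;

-- A trigger that should fire in both cases:
ALTER TABLE accounts ENABLE ALWAYS TRIGGER audit_accounts;
```

That selectivity is exactly what a trigger-based replication scheme needs to avoid loops.)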

Is there any reason why replication couldn't be implemented using triggers and a handful of stored procedures?

That's usually how it's done. Well, plus some external user-land application libraries.

That's just it - I don't think any user-land libraries would actually be required. One of the supposed big advantages of MySQL is its straightforward replication support. It's quite painful to see PostgreSQL suffer purely for lack of marketing in this department. :-(

This is what I have in mind:

Have a plperl function that creates connections to all servers in the cluster (replication partners) and issues the supplied write query to them, possibly with a tag of some sort to indicate that it is a replicated query (to prevent circular replication).

Have a post-execute trigger that, if the query was issued directly (as opposed to arriving via replication) and completed successfully, calls the above replication function and passes it the query just executed.

If the replication failed on any node, the whole thing gets rolled back.
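The scheme above could be sketched roughly as follows. This is an untested illustration, not a working implementation: the "replication.nodes" config table is hypothetical, it handles INSERTs only, and it uses the row data from $_TD rather than the original query text (PostgreSQL triggers do not receive the statement that fired them):

```sql
-- plperlu (untrusted PL/Perl) is needed so the function can use DBI.
CREATE OR REPLACE FUNCTION replicate_insert() RETURNS trigger AS $$
    use DBI;

    # Hypothetical config table listing the active replication peers.
    my $peers = spi_exec_query(
        q{SELECT dsn FROM replication.nodes WHERE active});

    # Rebuild an INSERT from the NEW row supplied to the trigger.
    my $new  = $_TD->{new};
    my @cols = sort keys %$new;
    my $sql  = sprintf 'INSERT INTO %s (%s) VALUES (%s)',
        $_TD->{relname},
        join(',', @cols),
        join(',', ('?') x @cols);

    for my $peer (@{ $peers->{rows} }) {
        # RaiseError makes any peer failure die, aborting the local
        # transaction -- i.e. the whole thing gets rolled back.
        my $dbh = DBI->connect($peer->{dsn}, undef, undef,
                               { RaiseError => 1 });
        # Tag the session as 'replica' so this same trigger does not
        # fire again on the peer (prevents circular replication).
        $dbh->do(q{SET session_replication_role = 'replica'});
        $dbh->do($sql, undef, map { $new->{$_} } @cols);
        $dbh->disconnect;
    }
    return;   # AFTER trigger: return value is ignored
$$ LANGUAGE plperlu;

CREATE TRIGGER replicate_accounts
    AFTER INSERT ON accounts
    FOR EACH ROW EXECUTE PROCEDURE replicate_insert();
```

A real version would also need UPDATE/DELETE handling, primary-key awareness, and connection caching across rows, but the shape is as described: a trigger, a plperl function, and no external daemon.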

That sounds pretty brittle. Do you really want all progress in your databases to stop if there is a network issue to a single server?

1) That's what MySQL does (it either ignores errors or stops replication on encountering an error, which of those two it does is selectable, but that's about it).

2) If there is a network/server issue, that would be detected because the $dbh would break. If the $dbh breaks, then plperl can either attempt to re-connect, or failing that, boot the node from the replication cluster (could have a node list with active/fenced flag in a separate config schema). Neither approach would be difficult to implement.

It could then simply not bother reconnecting the fenced node until the user updates its status in that config schema table. The configuration schema itself could also be replicated.
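The config schema I have in mind would be tiny. A hypothetical sketch (names illustrative):

```sql
CREATE SCHEMA replication;

-- One row per cluster member, with an active/fenced flag.
CREATE TABLE replication.nodes (
    node_name text PRIMARY KEY,
    dsn       text NOT NULL,              -- DBI connect string for the peer
    active    boolean NOT NULL DEFAULT true
);

-- Fencing a dead peer is just a flag flip; the replication function
-- skips inactive nodes until an administrator re-enables them:
UPDATE replication.nodes SET active = false WHERE node_name = 'node3';
```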

This would effectively give star-topology synchronous replication with very little effort, and no need for any external code. Am I missing something obvious that would prevent this from working? It would give replication capabilities better than MySQL's (which can only handle ring-based multi-master replication) at the cost of about 100 lines of code. None of the required functionality is new to PostgreSQL, either.

Is there an existing implementation of this? Perhaps a perl program that creates the required triggers and stored procedures from looking at a schema?

What you've described here would be pretty simple to implement. However, I think you've greatly underestimated the performance issues involved. If you need to push data to multiple databases before each transaction commits I think you'll find that pretty slow.

Only if transactions are used. I'm basing the requirements on "at least as good as MySQL", which this would meet even without transactions. If transactions are wanted they could be enabled; otherwise it could just be fire-and-forget asynchronous replication a la MySQL. Having a choice between transactions and speed is good. :-)

One thing I haven't quite thought of a good way to do with this approach is the equivalent of the useful (albeit deprecated) LOAD DATA FROM MASTER command, which brings a server into sync by dropping and re-loading all the tables from the master(s) (or rather, peers, in multi-master star replication) and then enables it in the replication. That would be neater than requiring downtime or global write locks. But I guess that could wait until version 2. :)

That's why most of the available third party solutions are asynchronous. The biggest options out there are Slony and Londiste (both master-slave, asynchronous) and Bucardo (asynchronous, but supports both master-master and master-slave) which, as you would have it, is written in Perl.

I looked at all of the above, and they all seemed (to me at least) to involve unnecessary complication or limitations I saw as unreasonable (or both). I looked at Bucardo in detail, and I was rather disappointed to see that it only supports two master nodes at the moment.

Thanks.

Gordan

