This is great. Thank you both for thinking about my situation more deeply. The secondary node in Ireland is purely for DR purposes. It will be a read-only copy that can be referred to only in the event that it's absolutely needed. I will investigate the asynchronous methods more deeply. I came across this matrix with some good information to go from: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling#Features_in_the_Core_of_PostgreSQL

Thanks,

Alex

> On Apr 30, 2015, at 6:03 PM, Fabio Ugo Venchiarutti <fabio@xxxxxxxx> wrote:
>
> Point taken. William is right.
>
> My recommendations were unusually pessimistic, as I didn't take enough time to assess the global/instantaneous data-change visibility requirements on your part.
>
> If the cluster is ONLY for HA and you don't need to read fresh data off secondary nodes (e.g. HA + read load balancing), asynchronous is good enough in most cases.
>
> On 01/05/15 06:37, William Dunn wrote:
>> Alex,
>>
>> Note that you should be wary of suggestions to make your replication synchronous. Synchronous replication is rarely used for this kind of use case (Cisco Jabber), where the most complete durability of the standby is not of the utmost concern (as it would be in a banking application). Not only will it decrease performance, but since you expect to have only one local standby it could actually decrease your availability, because if your standby went down no transactions would be able to commit on the master. See the Synchronous Replication section of the docs for more details (http://www.postgresql.org/docs/devel/static/warm-standby.html).
>>
>> Also note that Fabio's suggestion that you should not have your application commit more than one transaction per user operation is only applicable to synchronous replication (and since this is for Cisco Jabber, where you neither control nor have much information about the number of commits the application sends per user operation, that suggestion is not applicable anyway...). With asynchronous master-slave replication, the typical issue caused by streaming replication latency is that transactions go to the master and then the application sends a read-only query to the slave before the slave has received those transactions. As long as you don't have the application consider the user operation completed before all of its transactions are committed, I don't think having multiple transactions would make your replication latency issue any worse.
>>
>> For example, take a calendar application where a user enters event details and creates an event. The application may be set up to execute two transactions: 1) add the event and its details to the calendar events table, and 2) once the event-creation transaction returns, add the current user as an attendee of that event. In this case both transactions go against the master, so how far the slave is behind isn't a factor. Of course it would be faster overall to send the inserts as a single database procedure, but that all goes against the master database too, so streaming replication is not a factor in that consideration either.
>>
>> *William J. Dunn*
>> _willjdunn.com <http://willjdunn.com>_
>> *P* 978-844-4427 | _dunnwjr@xxxxxxxxx <mailto:dunnwjr@xxxxxxxxx>_ | _dunnw@xxxxxx <mailto:dunnw@xxxxxx>_
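A side note on the calendar example above: the two inserts can also be collapsed into a single statement with a data-modifying CTE, so only one statement and one commit hit the master. This is just a sketch; the calendar_events and event_attendees tables and their columns are hypothetical, not something from this thread:

    -- One statement creates the event and registers the creator as an
    -- attendee; the RETURNING value from the CTE feeds the second insert.
    WITH new_event AS (
        INSERT INTO calendar_events (title, starts_at, created_by)
        VALUES ('Team sync', TIMESTAMP '2015-05-04 10:00', 42)
        RETURNING event_id
    )
    INSERT INTO event_attendees (event_id, user_id)
    SELECT event_id, 42 FROM new_event;

Either way, as William points out, both writes land on the master, so standby lag never enters into it.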
>>
>> On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@xxxxxxxx <mailto:fabio@xxxxxxxx>> wrote:
>>
>> > WAN delays can cause problems for any replication system; you just have
>> > to be aware of that and not push things too hard (or try and violate the
>> > laws of physics). For example, streaming replication set to be
>> > synchronous crossing the planet is something you'd probably be rather
>> > unhappy with. :)
>>
>> In my experience streaming replication fits most use cases due to its inherent simplicity and robustness, but you might need to adjust your software design to get the best out of it.
>>
>> More specifically, latency issues can be heavily mitigated by having the application software commit no more than one transaction per user operation, provided 1 x "master<->sync_slave round trip time" is an acceptable delay when users submit forms or the like.
>>
>> It can get much worse if the application server is on a different geographical node than the DB master. In that case it is realistically beneficial to batch multiple write operations into a single STATEMENT instead. If the synchronous replication slave is on yet another node, the best-case (single statement) scenario would be 2 x round trip time. This configuration is more common than you might think, as some setups feature remote app servers reading off synchronous slaves at their own physical location while committing against a master that is somewhere else.
>>
>> Cheers
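For reference on the synchronous-versus-asynchronous point above: whether a commit waits on a standby at all is governed by synchronous_standby_names (together with synchronous_commit) on the master. A minimal sketch using ALTER SYSTEM (PostgreSQL 9.4+; the same settings can go straight into postgresql.conf). The standby application_name 'ireland_dr' is an assumption for illustration, not something from this thread:

    -- Asynchronous streaming replication (the default): commits wait only
    -- for the local WAL flush, so WAN latency never adds to commit time.
    ALTER SYSTEM SET synchronous_standby_names = '';
    SELECT pg_reload_conf();

    -- Synchronous replication, by contrast, ties every commit to at least one
    -- master<->standby round trip, and blocks commits entirely if the only
    -- listed standby is unreachable:
    -- ALTER SYSTEM SET synchronous_standby_names = 'ireland_dr';

Whatever name is listed there has to match the application_name the standby sets in its primary_conninfo.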
>>
>> On 30/04/15 11:06, Jim Nasby wrote:
>>
>> > On 4/29/15 1:13 PM, Alex Gregory wrote:
>> >
>> > > I was thinking that I could use Slony but then I read that it does not
>> > > like WAN replication. I have also read about streaming replication
>> > > native to Postgres but was not sure how that would work over the WAN.
>> > > Bucardo seems better for data warehousing or multimaster situations,
>> > > which this is not. That leaves pgpool-II, which seems like it would
>> > > add an extra layer of complexity.
>> >
>> > WAN delays can cause problems for any replication system; you just have
>> > to be aware of that and not push things too hard (or try and violate the
>> > laws of physics). For example, streaming replication set to be
>> > synchronous crossing the planet is something you'd probably be rather
>> > unhappy with. :)
>> >
>> > I haven't played with Slony in forever, but when I did it loved to lock
>> > things. That would not play well with high latency.
>> >
>> > I have run londiste between sites within the same city, and that worked
>> > well.
>> >
>> > Bucardo and pgpool are both based on the idea of replaying SQL statements
>> > instead of replicating actual data. They have their uses, but I
>> > personally distrust that idea, especially for DR.
>> >
>> > > When it comes down to it, there are so many choices I am not sure if I
>> > > need one or a combination of two. Any help you could provide would be
>> > > greatly appreciated.
>> >
>> > If you want to replicate within a data center then streaming replication
>> > is pretty nice, and as a bonus you might be able to do synchronous as
>> > well. The downside to streaming rep is that it's binary, so if you ever
>> > suffer data corruption you're practically guaranteed that corruption will
>> > end up on the replica. Logical replication like londiste or Slony is much
>> > more robust against that. You also can't use temporary tables with
>> > streaming rep, and you have to replicate the details of ALL activity,
>> > including maintenance like VACUUM. In some environments that might be
>> > slower than logical replication.
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx <mailto:pgsql-general@xxxxxxxxxxxxxx>)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
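Since the Ireland node will be asynchronous, a practical follow-up while investigating the options is to keep an eye on how far the DR copy lags the master. A rough sketch of the kind of check available in the 9.x releases current at the time of this thread (these functions and columns were renamed from "xlog"/"location" to "wal"/"lsn" in PostgreSQL 10):

    -- On the master: per-standby replication state and approximate lag in bytes.
    SELECT application_name,
           client_addr,
           state,
           sync_state,
           pg_xlog_location_diff(pg_current_xlog_location(),
                                 replay_location) AS replay_lag_bytes
    FROM pg_stat_replication;

    -- On the standby: how stale is the last replayed transaction?
    SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;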