
Re: Postgresql replication

Another tidbit I'd like to add. What has helped a lot in implementing high-latency master-master replication is writing our software with a business process model in mind, where data is not posted directly to the final tables. Instead, users are generally allowed to enter anything -- it could be incorrect, incomplete, or the user may not have rights -- and the data is still dumped into "pending" tables for people with rights to fix/review/approve later. Only after that process is the data posted to the final tables. (Good data entered on the first try still gets pended -- the validation phase simply assumes the user who entered the data is also the one who fixed/reviewed/approved it.)
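As a concrete illustration, here is a minimal sketch of the pending/final split. The table and column names (invoice_pending, invoice, status, payload) are made up for this example, not taken from our actual schema:

CREATE TABLE invoice_pending (
    id          varchar(11) PRIMARY KEY,   -- globally unique across servers (see the ID discussion quoted below)
    entered_by  varchar(32) NOT NULL,
    entered_at  timestamptz NOT NULL DEFAULT now(),
    status      varchar(8)  NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'approved', 'rejected', 'posted')),
    reviewed_by varchar(32),               -- NULL until somebody fixes/reviews/approves
    payload     text NOT NULL              -- the raw user entry, validated during review
);

CREATE TABLE invoice (
    id          varchar(11) PRIMARY KEY,
    posted_at   timestamptz NOT NULL DEFAULT now(),
    payload     text NOT NULL
);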

In terms of replication, this model allows users to enter data on any server. The pending records then get replicated to every server. Each specific server then looks at its own dataset of pendings to post to final tables. Final data is then replicated back to all the participating servers.
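As a sketch, the posting pass on each server might look like the following, assuming -- my assumption, not stated above -- that a server recognizes its own pendings by the server-code prefix on the ID ('A' on this server):

BEGIN;

-- Post approved pendings that this server owns...
INSERT INTO invoice (id, payload)
SELECT id, payload
FROM   invoice_pending
WHERE  status = 'approved'
  AND  id LIKE 'A%';

-- ...and mark them so they aren't posted twice.
UPDATE invoice_pending
SET    status = 'posted'
WHERE  status = 'approved'
  AND  id LIKE 'A%';

COMMIT;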

There may be a delay for the user if he/she is working on a server that doesn't have rights to post his data. However, the pending->post model gets users used to the idea that (1) all data is entered in one large swoop and validated/posted afterwards, and (2) data can/will sit in pending for a period of time until it is acted upon by somebody/some server with the proper authority. Hence users aren't expecting results to pop up on the screen the moment they press the submit button.




William Yu wrote:
Yes, it requires a lot of foresight to do multi-master replication -- especially across high-latency connections. I do that now for 2 different projects. We have servers across the country replicating data every X minutes, with custom app logic that resolves conflicting data.

Allocation of unique IDs that don't collide across servers is a must. For one project, instead of using numeric IDs, we use CHAR and prepend a unique server code, so record #1 on server A is A0000000001 versus the same sequence number under a different server code elsewhere. For the other project, we were too far along in development to change all our numerics into chars, so we wrote custom sequence logic to divide our 10-billion ID space into 1 to X billion for server 1, X to Y billion for server 2, etc.
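Rough sketches of both schemes (sequence names and range boundaries are invented for illustration):

-- Scheme 1: CHAR IDs with a prepended server code ('A' on this server).
CREATE SEQUENCE char_id_seq;
SELECT 'A' || lpad(nextval('char_id_seq')::text, 10, '0');   -- yields 'A0000000001'

-- Scheme 2: carve the shared numeric ID space into per-server ranges.
-- Here server 2 owns 2,000,000,001 through 4,000,000,000.
CREATE SEQUENCE range_id_seq
    START WITH 2000000001
    MINVALUE   2000000001
    MAXVALUE   4000000000
    NO CYCLE;   -- error out rather than wrap into another server's range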

With this step taken, we then had to isolate (1) transactions that could run on any server without issue (where we always take the newest record), (2) transactions that required an amalgam of all actions, and (3) transactions that had to be limited to "home" servers. Record keeping, where we keep a running history of all changes, fell into the first category. It would have been no different than 2 users on the same server updating the same object at different times during the day. Updating of summary data fell into category #2 and required parsing the change history of individual elements. Category #3 was financial transactions requiring strict locks; these were divided up by client/user space and restricted to the user's home server. This case would not allow auto-failover. Instead, it would require some prolonged threshold of downtime for a server before full financials are allowed on backup servers.
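For category (1), the "always take the newest record" rule can be enforced when replicated rows are applied. This is only a sketch -- the record_state table and its columns are my assumptions, and the INSERT ... ON CONFLICT syntax shown requires PostgreSQL 9.5 or later:

CREATE TABLE record_state (
    record_id  varchar(11) PRIMARY KEY,
    changed_at timestamptz NOT NULL,
    payload    text NOT NULL
);

-- Apply an incoming replicated row only if it is newer than what we have.
INSERT INTO record_state AS r (record_id, changed_at, payload)
VALUES ('A0000000001', '2005-03-01 12:00:00+00', 'replicated contents')
ON CONFLICT (record_id) DO UPDATE
SET changed_at = EXCLUDED.changed_at,
    payload    = EXCLUDED.payload
WHERE EXCLUDED.changed_at > r.changed_at;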

