Search Postgresql Archives

Re: Postgresql replication

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



William Yu wrote:

Chris Travers wrote:

Why not have the people who have rights to review this all write to the master database and have that replicated back? It seems like latency is not really an issue. Replication here is only going to complicate


What master database? Having a single master defeats the purpose of load balancing to handle more users.

I guess I am thinking along different lines than you. I was thinking that the simplest solution would be to have master/slave replication for *approved* transactions only and no replication for initial commits prior to approval. This makes the assumption that a single transaction will be committed on a single server, and that a single transaction will not be split over multiple servers. In this way, you can commit a pending transaction to any single server, and when it is approved, it gets replicated via the master. See below for more.



> things.  If it were me, I would be having my approval app pull data
> from *all* of the databases independently and not rely on the
> replication for this part.  The replication could then be used to
> replicate *approved* data back to the slaves.

If your app client happens to have high speed access to all servers, fine. And you can guarantee uptime connections to all servers except for the rare cases of hardware failure. The problem is if you don't, you end up with every transaction running at the speed of the slowest connection between a client and the farthest DB. While the final status of a transaction does not need to show up anytime soon on a user's screen, there still needs to be fast response for each individual user action.

Well... It depends on how it is implimented I guess. If you pull transactional information in the background while the user is doing other things, then it shouldn't matter. Besides, what should actually happen is that your connection is only as slow as the connection to the server which hosts the pending transaction you are trying to commit at the moment. In this way, each request only goes to one server (the one which has the connection). You could probably use DBI-Link and some clever materialized views to maintain the metadata at each location without replicating the whole transaction. You could probably even use DBI-Link or dblink to pull the transactions in a transparent way. Or you could replicate transactions into a pending queue dynamically... There are all sorts of ways you could make this respond well over slow connections. Remember, PostgreSQL allows you to separate storage from presentation of the data, and this is quite powerful.


How bad does the response get? I've done some simple tests comparing APP <-LAN-> DB versus APP <-cross country VPN-> DB. Even simple actions like inserting a recording and checking for a dupe key violation (e.g. almost no bandwidth needed) takes about 10,000 times longer than over a 100mbit LAN.

I think you could design a database such that duplicate keys are not an issue and only get checked on the master and then should never be a problem.

Thinking about it.... It seems here that one ends up with a sort of weird "multi-master" replication based on master/slave replication if you replicate these changes in the background (via another process, Notify, etc).



I still don't understand the purpose of replicating the pending data...


Imagine a checking account. A request to make an online payment can be made on any server. The moment the user submits a request, he sees it on his screen. This request is considered pending and not a true transaction yet. All requests are collected together via replication and the "home" server for that account then can check the account balance and decide whether there's enough funds to issue those payments.

Thinking about this.... The big issue is that you only want to replicate the deltas, not the entire account. I am still thinking master/slave, but something where the deltas are replicated in the background or where the user, in checking his account, is actually querying the home server. This second issue could be done via dblink or DBI-Link and would simply require that a master table linking the accounts with home servers be replicated (this should, I think, be fairly low-overhead).

Best Wishes,
Chris Travers
Metatron Technology Consulting

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux