William Yu wrote:
> Chris Travers wrote:
>> Why not have the people who have rights to review this all write to
>> the master database and have that replicated back? It seems like
>> latency is not really an issue. Replication here is only going to
>> complicate
> What master database? Having a single master defeats the purpose of
> load balancing to handle more users.
I guess I am thinking along different lines than you. I was thinking
that the simplest solution would be to have master/slave replication for
*approved* transactions only and no replication for initial commits
prior to approval. This makes the assumption that a single transaction
will be committed on a single server, and that a single transaction will
not be split over multiple servers. In this way, you can commit a
pending transaction to any single server, and when it is approved, it
gets replicated via the master. See below for more.
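As a rough sketch of what I mean (all table and column names here are invented for illustration, and the replication system itself — Slony or whatever — is assumed, not shown): keep pending entries in a purely local table, and only move a row into a replicated table when it is approved, so the master/slave machinery only ever sees approved data.

```sql
-- Local to each server; deliberately NOT in the replication set.
CREATE TABLE pending_transactions (
    txn_id      serial PRIMARY KEY,
    account_id  integer NOT NULL,
    amount      numeric(12,2) NOT NULL,
    entered_by  text NOT NULL,
    entered_at  timestamptz DEFAULT now()
);

-- In the replication set, replicated master -> slaves.
CREATE TABLE approved_transactions (
    txn_id      integer,
    origin      text,               -- which server took the original commit
    account_id  integer NOT NULL,
    amount      numeric(12,2) NOT NULL,
    approved_by text NOT NULL,
    approved_at timestamptz DEFAULT now(),
    PRIMARY KEY (origin, txn_id)    -- origin qualifies the per-server serial
);

-- On approval (run on the server holding the pending row), e.g.:
--   INSERT INTO approved_transactions
--          (txn_id, origin, account_id, amount, approved_by)
--   SELECT txn_id, 'server_a', account_id, amount, 'approver'
--     FROM pending_transactions WHERE txn_id = 42;
--   DELETE FROM pending_transactions WHERE txn_id = 42;
```

The replication system then fans approved_transactions back out to the slaves, and the pending table never crosses the wire.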
>> things. If it were me, I would be having my approval app pull data
>> from *all* of the databases independently and not rely on the
>> replication for this part. The replication could then be used to
>> replicate *approved* data back to the slaves.
> If your app client happens to have high speed access to all servers,
> fine. And you can guarantee uptime connections to all servers except
> for the rare cases of hardware failure. The problem is if you don't,
> you end up with every transaction running at the speed of the slowest
> connection between a client and the farthest DB. While the final
> status of a transaction does not need to show up anytime soon on a
> user's screen, there still needs to be fast response for each
> individual user action.
Well... It depends on how it is implemented, I guess. If you pull
transactional information in the background while the user is doing
other things, then it shouldn't matter. Besides, what should actually
happen is that your connection is only as slow as the connection to the
server which hosts the pending transaction you are trying to commit at
the moment. In this way, each request only goes to one server (the one
which has the connection). You could probably use DBI-Link and some
clever materialized views to maintain the metadata at each location
without replicating the whole transaction. You could probably even use
DBI-Link or dblink to pull the transactions in a transparent way. Or
you could replicate transactions into a pending queue dynamically...
There are all sorts of ways you could make this respond well over slow
connections. Remember, PostgreSQL allows you to separate storage from
presentation of the data, and this is quite powerful.
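For example, something along these lines with contrib/dblink (the connection string and names are invented for illustration) would let the approval app, connected to any one server, pull a pending transaction from whichever server actually holds it, without replicating the row first:

```sql
-- Pull one pending transaction from a remote server via dblink.
-- 'pending_transactions' and the connection string are hypothetical.
SELECT *
  FROM dblink('host=server-b dbname=app user=approver',
              'SELECT txn_id, account_id, amount
                 FROM pending_transactions
                WHERE txn_id = 42')
       AS remote(txn_id integer, account_id integer, amount numeric);
```

A view that UNIONs the local pending table with dblink queries against the other servers could present all pending items transparently, which is what I mean by separating storage from presentation.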
> How bad does the response get? I've done some simple tests comparing
> APP <-LAN-> DB versus APP <-cross country VPN-> DB. Even simple
> actions like inserting a record and checking for a dupe key
> violation (i.e. almost no bandwidth needed) take about 10,000 times
> longer than over a 100mbit LAN.
I think you could design the database so that duplicate keys are not an
issue on the individual servers and only get checked on the master, in
which case they should never be a problem.
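One common way to do that (sequence name hypothetical) is to give each server its own disjoint sequence range, so locally generated keys are globally unique by construction and can never collide when they reach the master:

```sql
-- On server A: generates 1, 11, 21, 31, ...
CREATE SEQUENCE txn_id_seq START 1 INCREMENT 10;

-- On server B: generates 2, 12, 22, 32, ...
-- CREATE SEQUENCE txn_id_seq START 2 INCREMENT 10;
```

With up to ten servers in this sketch, no two servers can ever issue the same id, so the duplicate-key check is purely local.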
>> Thinking about it.... It seems here that one ends up with a sort of
>> weird "multi-master" replication based on master/slave replication if
>> you replicate these changes in the background (via another process,
>> Notify, etc).
> I still don't understand the purpose of replicating the pending data...
> Imagine a checking account. A request to make an online payment can be
> made on any server. The moment the user submits a request, he sees it
> on his screen. This request is considered pending and not a true
> transaction yet. All requests are collected together via replication
> and the "home" server for that account then can check the account
> balance and decide whether there's enough funds to issue those payments.
Thinking about this.... The big issue is that you only want to
replicate the deltas, not the entire account. I am still thinking
master/slave, but something where the deltas are replicated in the
background or where the user, in checking his account, is actually
querying the home server. This second approach could be done via dblink or
DBI-Link and would simply require that a master table linking the
accounts with home servers be replicated (this should, I think, be
fairly low-overhead).
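To sketch that last idea (again, all names and the connection strings are made up): replicate only a small routing table everywhere, and forward the balance check to the home server with dblink instead of replicating the account itself.

```sql
-- Small, cheaply replicated routing table: one row per account.
CREATE TABLE account_home (
    account_id  integer PRIMARY KEY,
    home_conn   text NOT NULL      -- e.g. 'host=server-b dbname=app'
);

-- Check a balance on the account's home server, wherever that is.
-- 'accounts' on the remote side is hypothetical.
SELECT balance
  FROM dblink(
         (SELECT home_conn FROM account_home WHERE account_id = 1001),
         'SELECT balance FROM accounts WHERE account_id = 1001')
       AS t(balance numeric);
```

Since account_home is tiny and changes rarely, replicating it to every server really should be low-overhead, and every other query goes to exactly one server.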
Best Wishes,
Chris Travers
Metatron Technology Consulting