Here's the problem: five sites, one DB, and all five sites have read/write access to the DB. If one site goes down, the other four should be able to continue working with the DB (read and write). When the dead site comes back online, it ought to be able to rejoin the group.
If one site becomes isolated (not "down", just network issues, say), it ought to go into read-only mode until it can rejoin as a member of the group. I was thinking: one master, four slaves. Writes go only to the master (over the WAN). No write transaction can be committed until it has been duplicated at all the slave sites (so far, I think, this is a standard requirement/request).
Now, the "master" token can get passed from one site to another depending on the viability of the communications between the sites. If site A was the master but went down, the remaining four should be smart enough to detect this and decide who becomes the new master. If site A became isolated, it ought to detect that it can't communicate with the other sites and put itself into read-only mode. All of this should be automatic.
I'd be willing to code up solutions to detect network viability, pass the "master" token around, etc. I'd like to know whether a basic 5-site (1 master, 4 slaves) config is possible/sensible/viable. If so, what software/solution would be best? (All Linux-to-Linux, WAN access, intercontinental/global.)
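
To make the "who is master, who goes read-only" rule concrete, here is a minimal sketch of one way the decision could work. It is only an illustration under stated assumptions: the site names, `has_quorum` and `decide_role` are hypothetical, each site is assumed to already know which peers it can reach, and reachability is assumed to be symmetric. It uses a strict-majority (quorum) rule, so at most one partition of the five sites can ever hold a master, and an isolated site drops to read-only on its own.

```python
from typing import Iterable, Optional

# Hypothetical site names; the real deployment would use its own identifiers.
SITES = ("A", "B", "C", "D", "E")


def has_quorum(group: Iterable[str], total: int = len(SITES)) -> bool:
    """Return True when the group is a strict majority of all sites."""
    return len(set(group)) > total // 2


def decide_role(me: str, reachable_peers: Iterable[str],
                current_master: Optional[str]) -> str:
    """Decide this site's role from the peers it can currently reach.

    Returns "master", "slave", or "read_only".
    """
    group = set(reachable_peers) | {me}

    # Without a majority we cannot rule out a master elsewhere,
    # so the only safe choice is to stop accepting writes.
    if not has_quorum(group):
        return "read_only"

    # Keep the existing master if it is still in our group; otherwise
    # every member picks the same successor (lowest name, an arbitrary
    # but deterministic tie-break assumed for this sketch).
    master = current_master if current_master in group else min(group)
    return "master" if master == me else "slave"


if __name__ == "__main__":
    # Site A was master and is now cut off from everyone.
    print(decide_role("A", [], "A"))
    # The other four still see each other and must pick a new master.
    for site in ("B", "C", "D", "E"):
        peers = [s for s in ("B", "C", "D", "E") if s != site]
        print(site, decide_role(site, peers, "A"))
```

The sketch leaves out the hard parts: agreeing on membership when links are flaky or one-way, and making sure the new master has every committed write. Existing consensus-based tools handle those, so this is a way to reason about the rule rather than something to deploy.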