Hi,

I don't like to promote my own website on the mailing list, but this
article tries to address your problem with a possible answer :-)

http://www.sebastien-han.fr/blog/2013/01/28/ceph-geo-replication-sort-of/

--
Regards,
Sébastien Han.


On Thu, Feb 14, 2013 at 2:03 PM, Joao Eduardo Luis <joao.luis@xxxxxxxxxxx> wrote:
> On 02/14/2013 12:24 PM, Simon Leinen wrote:
>>
>> Wolfgang Hennerbichler writes:
>>>
>>> I have a ceph cluster on 2 sites. One site has 2 mons, the other site
>>> has 1 mon. [...]
>>
>>
>> As Martin wrote, if you lose the site with the 2 mons, the entire
>> cluster will become unavailable.
>>
>> Here's what I've been thinking to myself could be a nice solution:
>>
>> Get a third site somewhere, and move one of the currently 2 mons from
>> your first site to that third site. The site only needs space and
>> performance for one (1) VM running ceph-mon. Ideally it would be
>> reliable, well-connected in terms of RTT etc. - but even if it isn't,
>> that may not matter so much.
>>
>> My reasoning is that under normal conditions, the two mons in your
>> "real" sites will be sufficient (quorum) to maintain consistency of the
>> cluster. So even if the third-site mon is somehow "asleep at the
>> wheel", that wouldn't necessarily have any noticeable impact on your
>> cluster's performance. (That's pure hypothesis, I haven't tried this or
>> otherwise thought this through. Please comment if you disagree!)
>
>
> This could actually cause problems for the quorum. If one of the monitors
> doesn't perform well enough, or if it gets overloaded, you may start seeing
> it behave weirdly (constantly dropping out of quorum, for instance), and
> that might even affect the other two monitors (dropping out of quorum
> usually leads to a subsequent attempt to rejoin it, and that will trigger
> an election).
>
> The biggest problem is still regarding latency.
> Given that Paxos takes all the monitors in the quorum into
> consideration for each round, if the RTT to one of the monitors is high
> enough, Paxos will start to time out. A timeout makes a monitor
> bootstrap, which leads to a new election, etc. If one of the monitors
> has chronically high latency, it may render the cluster unusable.
>
> It is however possible to adjust those timeouts, but I have never tried
> to do so to deal with such a scenario. That ought to work, but I am not
> sure what the potential consequences (if any) of doing so could be.
>
>
>> I guess you can hardwire the ranks of the mons to make sure that the
>> third-site monitor never becomes elected as leader.
>
>
> Ranks are assigned by ip:port. If you have some freedom when assigning
> IPs, then it should be fairly straightforward (127.0.0.1:6789 <
> 127.0.0.2:6789 < 127.0.0.2:6790).
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
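
To illustrate the ip:port ordering Joao describes, here is a minimal Python
sketch. It assumes that ranks simply follow ascending (IP, port) order, with
the lowest address getting rank 0, which is my reading of the example in the
thread rather than something checked against the Ceph source. The monitor
names (site-a, site-b, site-c) and the helper mon_ranks are hypothetical;
only the three addresses come from the message above.

```python
import ipaddress


def mon_ranks(addrs):
    """Return {name: rank}, ranking monitors by ascending (IP, port).

    Assumption: rank order equals numeric ordering of the IP address,
    with the port as tie-breaker, as in the example from the thread.
    """
    def key(item):
        host, port = item[1].rsplit(":", 1)
        # Compare IPs numerically so 10.0.0.9 sorts before 10.0.0.10.
        return (int(ipaddress.ip_address(host)), int(port))

    ordered = sorted(addrs.items(), key=key)
    return {name: rank for rank, (name, _) in enumerate(ordered)}


# Hypothetical monitor names, using the addresses from the thread.
mons = {
    "site-c": "127.0.0.2:6790",
    "site-a": "127.0.0.1:6789",
    "site-b": "127.0.0.2:6789",
}
ranks = mon_ranks(mons)
print(ranks)
```

If the lowest rank is indeed the one favoured in elections (an assumption
here), giving the third-site monitor the address that sorts last would be
the way to keep it from being preferred as leader.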