Hi Alvaro:

From the code, I see

    unsigned need = monmap->size() / 2 + 1;

So with 2 mons, the quorum must be 2 before an election can succeed. That is
why I use 3 mons. I know that if I stop mon.0 or mon.1, everything will work
fine. But if this failure does happen, must it be handled by a human? As far
as you know, is there any way to handle it automatically by design?

On Tue, Jul 4, 2017 at 2:25 PM, Alvaro Soto <alsotoes@xxxxxxxxx> wrote:
> Z,
> You are forcing a Byzantine failure. The Paxos implementation used to form
> the consensus ring of the mon daemons does not support this kind of failure,
> which is why you get erratic behaviour. I believe it is the common Paxos
> algorithm that is implemented in the mon daemon code.
>
> If you just gracefully shut down a mon daemon, everything will work fine, but
> that way you cannot demonstrate a split-brain situation, because you will
> force the election of the leader by quorum.
>
> Maybe with 2 mon daemons and the communication between them closed, every
> mon daemon will believe that it can be a leader, because every daemon will
> have a quorum of 1 with no other vote.
>
> Just saying :)
>
>
> On Jul 4, 2017 12:57 AM, "Z Will" <zhao6305@xxxxxxxxx> wrote:
>>
>> Hi:
>> I am testing ceph-mon split brain. I have read the code. If I
>> understand it correctly, split brain cannot happen. But I think
>> there is still another problem. My ceph version is 0.94.10. Here
>> are my test details:
>>
>> There are 3 ceph-mons with ranks 0, 1 and 2 respectively. I stop the
>> rank 1 mon and use iptables to block the communication between mon.0
>> and mon.1. Once the cluster is stable, I start mon.1. I found that
>> none of the 3 monitors can work properly. They all keep trying to call
>> a new leader election. This means the cluster can't work anymore.
>>
>> Here is my analysis.
>> Because a mon will always respond to a leader
>> election message, and in my test communication between mon.0 and
>> mon.1 is blocked, mon.1 will always try to become leader: it
>> can always see mon.2, and it should win over mon.2. Mon.0 should
>> also always win over mon.2. But mon.2 will always respond to the
>> election message issued by mon.1, so this loop will never end. Am I right?
>>
>> Is this a problem? Or was it simply designed like this, and
>> should it be handled by a human?
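
For reference, the majority arithmetic discussed in this thread can be sketched in a few lines. This is only an illustration built around the one expression quoted from the mon code; the function names `quorum_needed` and `can_reach_majority` are hypothetical and are not Ceph functions, and the sketch says nothing about how the elector actually exchanges messages.

```cpp
#include <cassert>

// Majority threshold, mirroring the expression quoted from the mon code:
//   unsigned need = monmap->size() / 2 + 1;
unsigned quorum_needed(unsigned monmap_size) {
    assert(monmap_size > 0);       // an empty monmap cannot form a quorum
    return monmap_size / 2 + 1;    // integer division: 2 -> 2, 3 -> 2, 5 -> 3
}

// Hypothetical helper (not part of Ceph): can a mon that still reaches
// `reachable_peers` other mons gather a majority, counting its own vote?
bool can_reach_majority(unsigned monmap_size, unsigned reachable_peers) {
    return reachable_peers + 1 >= quorum_needed(monmap_size);
}
```

With 3 mons and only the mon.0/mon.1 link blocked, `can_reach_majority(3, 1)` holds for both mon.0 and mon.1, since each still reaches mon.2. That is at least consistent with both of them repeatedly calling elections. With 2 mons and the link cut, `can_reach_majority(2, 0)` is false, so under this arithmetic neither side could form a quorum of 1 on its own.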