Hello all,

I have a Ceph cluster composed of 4 nodes in 2 different rooms:

  room A: osd.1, osd.3, mon.a, mon.c
  room B: osd.2, osd.4, mon.b

My CRUSH rule is written to place replicas across rooms (a sketch of what such a rule looks like is at the end of this message), so normally, if I shut down the whole of room A, my cluster should stay usable.

... but in fact, no. When I switch off room A, mon.b does not succeed in managing the cluster. Here is the log of mon.b:

2013-04-05 11:46:11.842267 7f42e61fc700 0 mon.b@1(peon) e1 handle_command mon_command(status v 0) v1
2013-04-05 11:46:12.746317 7f42e61fc700 0 mon.b@1(peon) e1 handle_command mon_command(status v 0) v1
2013-04-05 11:46:17.684378 7f42e46f3700 0 -- 10.0.3.2:6789/0 >> 10.0.3.1:6789/0 pipe(0x7f42d4002c80 sd=26 :6789 s=2 pgs=47 cs=1 l=0).fault, initiating reconnect
2013-04-05 11:46:17.685624 7f42f0e93700 0 -- 10.0.3.2:6789/0 >> 10.0.3.1:6789/0 pipe(0x7f42d4002c80 sd=19 :35755 s=1 pgs=47 cs=2 l=0).fault
2013-04-05 11:46:17.721214 7f4266eee700 0 -- 10.0.3.2:6789/0 >> 10.0.3.3:6789/0 pipe(0x2b4c480 sd=17 :58791 s=2 pgs=26 cs=1 l=0).fault with nothing to send, going to standby
2013-04-05 11:46:18.453162 7f42e61fc700 0 mon.b@1(peon) e1 handle_command mon_command(status v 0) v1
2013-04-05 11:46:25.638744 7f42ec80d700 0 -- 10.0.3.2:6789/0 >> 10.0.3.3:6789/0 pipe(0x2b4c480 sd=17 :58791 s=1 pgs=26 cs=2 l=0).fault

What I understand from this is that mon.b does know that mon.a and mon.c are down, but it cannot form a quorum. Why?

Thanks for your answers.
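For context, a CRUSH rule that spreads replicas across rooms is typically written along these lines. This is only a sketch: the rule name, ruleset number, and size bounds here are placeholders, not a copy of my deployed rule.

  rule replicated_rooms {
          ruleset 1
          type replicated
          min_size 1
          max_size 10
          # start from the root of the CRUSH hierarchy
          step take default
          # pick one leaf (OSD) under a distinct room bucket for each
          # replica, so a pool of size 2 gets one copy per room
          step chooseleaf firstn 0 type room
          step emit
  }

With a pool of size 2 and a rule like this, losing a whole room should still leave one complete copy of the data, which is why I expected the surviving room to keep working.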
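In case it helps with the diagnosis: while room A is down, mon.b can still be queried through its local admin socket, which answers even when the monitor has no quorum (this assumes the default socket path under /var/run/ceph):

  ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok mon_status

Run on the host carrying mon.b, this prints the monitor's own view of the situation: its current state (probing, electing, peon, leader, ...), the monmap it knows, and which monitors it currently counts in quorum.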