Hi all,

I have a two-node Ceph cluster; both nodes run a monitor and OSDs. When both are up, all OSDs are up and in, and everything is fine... almost:

[root~]# ceph -s
    health HEALTH_WARN 25 pgs degraded; 316 pgs incomplete; 85 pgs stale; 24 pgs stuck degraded; 316 pgs stuck inactive; 85 pgs stuck stale; 343 pgs stuck unclean; 24 pgs stuck undersized; 25 pgs undersized; recovery 11/153 objects degraded (7.190%)
    monmap e1: 2 mons at {server_b=10.???.78:6789/0,server_a=10.???.80:6789/0}, election epoch 14, quorum 0,1 server_b,server_a
    osdmap e116375: 22 osds: 22 up, 22 in
    pgmap v238656: 576 pgs, 2 pools, 224 MB data, 59 objects
        56175 MB used, 63420 GB / 63475 GB avail
        11/153 objects degraded (7.190%)
              15 active+undersized+degraded
              75 stale+active+clean
               2 active+remapped
             158 active+clean
              10 stale+active+undersized+degraded
             316 incomplete

But if I bring down one server, the whole cluster stops functioning:

[root~]# ceph -s
2015-03-31 10:32:43.848125 7f57e4105700  0 -- :/1017540 >> 10.???.78:6789/0 pipe(0x7f57e0027120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f57e00273b0).fault

This should not happen... Any thoughts?
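
In case it helps, these are the checks I'd run next while both nodes are still up (the pool name "rbd" below is only an example; substitute your actual pool names):

    # Show the current quorum; with only 2 mons, losing either
    # one drops below majority, so the cluster loses quorum.
    ceph quorum_status

    # Check replication settings per pool; if size is larger than
    # the number of hosts, CRUSH cannot place all replicas, which
    # would explain the undersized/incomplete PGs.
    ceph osd pool get rbd size
    ceph osd pool get rbd min_size

    # List the PGs that are stuck and their states.
    ceph pg dump_stuck inactive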