Weird problem - maybe quorum related

One of our tests last night failed in a weird way.  We started with a
3 node cluster with 3 monitors, expanded to a 5 node cluster with
5 monitors, and then dropped back to a 4 node cluster with 3
monitors.

The sequence of events was:

start 3 monitors (monitors 0, 1, 2) - monmap e1
add one node
restart the 3 monitors
add another node
add monitor 4 - monmap e2
restart monitor 0
add monitor 3 - monmap e3
restart monitor 1
restart monitor 2
shut down the server with monitor 4 on it
remove monitor 4 - monmap e4
restart monitor 0
mon.0 had an odd time sync problem and respawned
stop monitor 3
remove monitor 3
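
To make the expected monmap epochs easier to follow, here is a small
Python sketch that replays the membership changes from the list above.
It is only a model of the sequence as described, not anything taken from
Ceph itself; the function name replay_monmap and the (operation, monitor)
tuples are made up for illustration. Restarts are left out because they
do not change the monmap.

```python
def replay_monmap(initial, changes):
    """Return a list of (epoch, sorted monitor ids), one entry per monmap epoch.

    Hypothetical helper: models only add/remove of monitor ids.
    """
    members = set(initial)
    history = [(1, sorted(members))]  # e1 is the initial monmap
    for epoch, (op, mon) in enumerate(changes, start=2):
        if op == "add":
            members.add(mon)
        elif op == "remove":
            members.discard(mon)
        else:
            raise ValueError(f"unknown operation: {op}")
        history.append((epoch, sorted(members)))
    return history


# Membership changes in the order given in the sequence of events.
changes = [("add", 4), ("add", 3), ("remove", 4), ("remove", 3)]

for epoch, members in replay_monmap([0, 1, 2], changes):
    print(f"e{epoch}: {members}")
```

In this model e4 contains monitors 0, 1, 2 and 3, and the final removal
of monitor 3 would be expected to produce an e5 containing 0, 1 and 2.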

At that point (08:23:52 in the log), Ceph stopped responding (as if
quorum had been lost).  Note that we do not see a new monmap (e5)
created by the removal of monitor 3.
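
For reference, Ceph monitors need a strict majority of the monitors
listed in the current monmap to form a quorum. The sketch below applies
that rule to monmap e4 (monitors 0, 1, 2, 3). The has_quorum helper is
hypothetical, and the second case, where mon.0 has not yet rejoined after
its respawn, is only an assumption about what might have happened; the
log does not confirm it.

```python
def has_quorum(monmap_members, reachable):
    """True if a strict majority of the monmap's monitors are reachable.

    Hypothetical helper: 'reachable' is the set of monitor ids that are
    up and able to talk to each other.
    """
    needed = len(monmap_members) // 2 + 1
    up = set(monmap_members) & set(reachable)
    return len(up) >= needed


e4 = {0, 1, 2, 3}  # monmap e4 still lists monitor 3

# Monitor 3 stopped, monitors 0, 1, 2 all healthy: 3 of 4 is a majority.
print(has_quorum(e4, {0, 1, 2}))

# Assumed case: monitor 3 stopped and mon.0 not yet back in: 2 of 4 is not.
print(has_quorum(e4, {1, 2}))
```

If something like the second case applied, the remaining monitors could
not commit the removal of monitor 3, which would be consistent with no
e5 ever appearing.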

See the (sort of) full log at:
https://gist.github.com/mdegerne/06fa38243bd462c46d39
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



