Re: Weird problem - maybe quorum related

Can you get the quorum and related dumps out of the admin socket for
each running monitor and see what they say?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
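
For reference, a minimal Python sketch of collecting those dumps. It assumes the default admin socket path /var/run/ceph/ceph-mon.<id>.asok; that path and the helper names are assumptions, so adjust them to match your deployment. mon_status and quorum_status are the monitor admin socket commands being asked for. The admin socket is local to each daemon, so this has to run on the host where that monitor lives.

```python
import subprocess

# Assumption: default admin socket location. Adjust this if "admin socket"
# is overridden in ceph.conf or the cluster is not named "ceph".
SOCKET_TEMPLATE = "/var/run/ceph/ceph-mon.{mon_id}.asok"
COMMANDS = ("mon_status", "quorum_status")


def admin_socket_commands(mon_id):
    """Build the ceph CLI invocations for one monitor's admin socket."""
    sock = SOCKET_TEMPLATE.format(mon_id=mon_id)
    return [["ceph", "--admin-daemon", sock, cmd] for cmd in COMMANDS]


def dump_monitor(mon_id):
    """Run each command locally and return the raw output keyed by command name."""
    results = {}
    for argv in admin_socket_commands(mon_id):
        proc = subprocess.run(argv, capture_output=True, text=True)
        # Keep stderr on failure so a missing socket is visible in the dump.
        results[argv[-1]] = proc.stdout if proc.returncode == 0 else proc.stderr
    return results
```

Calling dump_monitor("0") on the host running mon.0 returns a dict with one entry per command; comparing the monmap epoch and quorum members reported by each monitor should show whether they agree.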


On Tue, Jul 23, 2013 at 4:51 PM, Mandell Degerness
<mandell@xxxxxxxxxxxxxxx> wrote:
> One of our tests last night failed in a weird way.  We started with a
> three-node cluster with three monitors, expanded to a five-node cluster
> with five monitors, and then dropped back to a four-node cluster with
> three monitors.
>
> The sequence of events was:
>
> start 3 monitors (monitors 0, 1, 2) - monmap e1
> add one node
> restart the 3 monitors
> add another node
> add monitor 4 - monmap e2
> restart monitor 0
> add monitor 3 - monmap e3
> restart monitor 1
> restart monitor 2
> shut down the server with monitor 4 on it
> remove monitor 4 - monmap e4
> restart monitor 0
> mon.0 had an odd time sync problem and respawned
> stop monitor 3
> remove monitor 3
>
> At that point (08:23:52 in the log), ceph stopped responding (as if
> quorum was lost).  Note that we do not see a new monmap (e5) created
> by the removal of monitor 3.
>
> See the (sort of) full log at:
> https://gist.github.com/mdegerne/06fa38243bd462c46d39
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
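
As a rough aid for reading the sequence of events, the sketch below tabulates how many monitors must be up to form quorum under each monmap epoch. Ceph monitors need a strict majority of the monitors listed in the current monmap. The membership sets are inferred from the reported event list (e1 through e4), and the function names are illustrative only; this is not a diagnosis of the actual failure.

```python
def quorum_size(monmap_members):
    """Smallest strict majority of the monitors in a monmap."""
    return len(monmap_members) // 2 + 1


def has_quorum(monmap_members, running):
    """True if enough monmap members are running to form a majority."""
    up = set(monmap_members) & set(running)
    return len(up) >= quorum_size(monmap_members)


# Membership per epoch, inferred from the reported sequence of events.
MONMAPS = {
    "e1": {0, 1, 2},
    "e2": {0, 1, 2, 4},
    "e3": {0, 1, 2, 3, 4},
    "e4": {0, 1, 2, 3},
}

for epoch, members in MONMAPS.items():
    print(epoch, sorted(members), "needs", quorum_size(members))

# Under e4 with monitor 3 stopped, all of monitors 0, 1 and 2 must take part.
print(has_quorum(MONMAPS["e4"], {0, 1, 2}))  # True
print(has_quorum(MONMAPS["e4"], {1, 2}))     # False
```

Under e4, stopping monitor 3 leaves no slack: if any of mon.0, mon.1 or mon.2 is out of quorum at that moment (for example, while respawning), the remaining monitors cannot commit a new monmap, which would be consistent with no e5 appearing. That is only one possible reading; the admin socket dumps would confirm it or rule it out.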