Re: ceph monitor crashes

On Wed, 15 May 2013, Matt Chipman wrote:
> 
> Hi,
> You mention early on that one mon is a spare, but later say that you have
> 4 mons running. Ceph explicitly requires an odd number of monitors so a
> quorum can be established.

Quick correction: you can run any number of monitors, but even numbers are 
not recommended because there is little benefit in availability.  3 mons 
means a majority of 2 can form quorum (tolerate 1 failure), while 4 mons 
requires 3 up (still tolerating only 1 failure).  At the margin you are 
usually better off rounding down to an odd number.  But even-sized 
clusters will still work.
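
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of strict-majority quorum; the function names are made up for this example and are not part of Ceph.

```python
def quorum_size(num_mons: int) -> int:
    """Smallest strict majority of num_mons monitors (hypothetical helper)."""
    if num_mons < 1:
        raise ValueError("need at least one monitor")
    return num_mons // 2 + 1


def tolerated_failures(num_mons: int) -> int:
    """How many monitors can be down while a strict majority is still up."""
    return num_mons - quorum_size(num_mons)


if __name__ == "__main__":
    # Compare odd and even monitor counts side by side.
    for n in range(1, 8):
        print(f"{n} mons: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 monitors raises the required majority from 2 to 3 without raising the number of failures tolerated, which is why the extra monitor buys little.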

sage


> 
> I think you should have 3 or 5 mons running, not 3 and a spare.
> 
> Good luck
> 
> On 14/05/2013 12:58 PM, "Mr. NPP" <mr.npp@xxxxxxxxxxxxxxxxxxx> wrote:
> hello, i'm currently running 0.61, with about 44 osd's and 4
> monitors, one as a spare, across about 6 hosts.
> 
> I've been running into an issue where, when one ceph host goes down,
> the entire system becomes unusable. today we recovered from the crash
> of an ssd holding an osd's journal, and it was a lot of work to get it
> back up; i couldn't get the monitors to come up and establish quorum.
> I was going to rebuild it manually, but the ceph documentation on
> manually (dirty) removing a monitor with the monmap tool is outdated;
> i couldn't find the /mon-$id/monmap directory.
> 
> anyway, i eventually recovered and was able to run with 4 monitors.
> then i updated the crushmap, and that crashed the monitor i was
> sending the update to.
> 
> it now gives me
> 
> [976]: (33) Numerical argument out of domain
> 
> when i try to start it manually. i've seen this assert failure before,
> i'm just not sure what's causing it.
> 
> below is the log from the crash.
> https://docs.google.com/a/nopatentpending.com/file/d/0BwQnRodV8ActNTVFUVpLVjdMSGc/edit
> 
> i'm not even really sure if my configs are right, i'm still pretty new
> at this.
> 
> below are the configs, and the last map
> 
> ceph.conf
> https://docs.google.com/file/d/0BwQnRodV8Acta3ZfSnBrOU40MW8/edit?usp=sharing
> 
> crush.map.txt
> https://docs.google.com/file/d/0BwQnRodV8Actbl9hY054Mm9UTXM/edit?usp=sharing
> 
> if you need additional dumps from the monitor, i can get them.
> 
> thanks
> mr.npp
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 