Disaster recovery of monitor

Hello,

We ran into a problem with our test cluster after adding monitors. It now seems that our main monitor doesn't want to start anymore. The logs are flooded with:

2013-06-13 11:41:05.316982 7f7689ca4780 7 mon.a@0(leader).osd e2809 update_from_paxos applying incremental 2810
2013-06-13 11:41:05.317043 7f7689ca4780 1 mon.a@0(leader).osd e2809 e2809: 9 osds: 9 up, 9 in
2013-06-13 11:41:05.317064 7f7689ca4780 7 mon.a@0(leader).osd e2809 update_from_paxos applying incremental 2810

etc

When we start the monitor, after a while we get the following error:

service ceph start mon.a
=== mon.a ===
Starting Ceph mon.a on xxxxx...
[22037]: (33) Numerical argument out of domain
failed: 'ulimit -n 8192; /usr/bin/ceph-mon -i a --pid-file /var/run/ceph/mon.a.pid -c /etc/ceph/ceph.conf '
Starting ceph-create-keys on xxxx...

Is there a disaster recovery method for monitors? This is just a test environment, so I don't really care about the data, but if something like this happens in a production environment I would like to know how to get it back (if at all possible).

We just upgraded to 0.61.3. Perhaps we ran into a bug. When adding the monitors we just followed this guide:

http://ceph.com/docs/next/rados/operations/add-or-rm-mons/

After adding the monitors we ran into problems and tried to fix them with information we found online. We then started experimenting with the monmap, and I think this is where it went wrong.
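
The monmap editing referred to here is roughly the procedure the guide above gives for removing monitors from an unhealthy cluster. A minimal sketch follows, not a record of the exact commands that were run. The monitor ids `b` and `c` and the path `/tmp/monmap` are hypothetical placeholders, and whether `--extract-monmap` and `--inject-monmap` behave as expected on 0.61.3 is an assumption. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: rewrite the monmap of one surviving monitor so that it no longer
# lists the monitors that were added later.
MON_ID="${MON_ID:-a}"            # surviving monitor, matches [mon.a] in ceph.conf
BAD_MONS="${BAD_MONS:-b c}"      # hypothetical ids of the monitors to drop
MONMAP="${MONMAP:-/tmp/monmap}"  # hypothetical scratch file for the extracted map
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

recover_monmap() {
    run service ceph stop "mon.$MON_ID"                    # monitor must not be running
    run ceph-mon -i "$MON_ID" --extract-monmap "$MONMAP"   # dump the current map to a file
    run monmaptool --print "$MONMAP"                       # inspect it before editing
    for id in $BAD_MONS; do
        run monmaptool "$MONMAP" --rm "$id"                # drop each unwanted monitor
    done
    run ceph-mon -i "$MON_ID" --inject-monmap "$MONMAP"    # write the edited map back
    run service ceph start "mon.$MON_ID"
}

recover_monmap
```

Run as is, this prints the seven commands it would execute. Backing up the monitor data directory (`mon data` in ceph.conf) before running it with `DRY_RUN=0` would be a sensible precaution.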

We are running ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532)

/etc/ceph/ceph.conf is pretty simple for the monitor:

[global]
        auth supported = none
        auth cluster required = none
        auth service required = none
        auth client required = none

        public network = xxx.xxx.0.0/24
        cluster network = xxx.xxx.0.0/24

        mon initial members = xxxxx

[osd]
        osd journal size = 1000

[mds.a]
        host = xxxxx
        devs = /dev/sdb
        mds data = /var/lib/ceph/mds/ceph-0/

[mon.a]
        host = xxxxx
        mon addr = xxx.xxx.0.25:6789
        mon data = /var/lib/ceph/mon/ceph-a

etc

Thanks for looking and if you need more info let me know.

Cheers,

Peter

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



