Failed to add a mon, either with ceph-deploy or manually


 



Using ceph-deploy: 
I have ceph-node1 as the admin and mon node, and I would like to add ceph-node2 as another mon.
On ceph-node1:
ceph-deploy mon create ceph-node2
ceph-deploy mon add ceph-node2

The first command warns:

[ceph-node2][WARNIN] ceph-node2 is not defined in `mon initial members`
[ceph-node2][WARNIN] monitor ceph-node2 does not exist in monmap
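I guess this warning means the ceph.conf used on the admin node only lists ceph-node1 under mon initial members. Something like the following might be needed before retrying (assuming 192.168.57.103 is ceph-node2's address, which is what ceph-deploy itself uses in the mon add error further down):

# in the ceph.conf used by ceph-deploy on ceph-node1
mon_initial_members = ceph-node1, ceph-node2
mon_host = 192.168.57.101,192.168.57.103

# push the updated conf to both nodes, then retry
ceph-deploy --overwrite-conf config push ceph-node1 ceph-node2
ceph-deploy mon create ceph-node2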

The second command warns and throws errors:

[ceph-node2][WARNIN] IO error: lock /var/lib/ceph/mon/ceph-ceph-node2/store.db/LOCK: Resource temporarily unavailable
[ceph-node2][WARNIN] 2016-07-14 16:25:14.838255 7f6177f724c0 -1 asok(0x7f6183ef4000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-mon.ceph-node2.asok': (17) File exists
[ceph-node2][WARNIN] 2016-07-14 16:25:14.844003 7f6177f724c0 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-ceph-node2': (22) Invalid argument
[ceph-node2][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.mon][ERROR ] Failed to execute command: ceph-mon -i ceph-node2 --pid-file /var/run/ceph/mon.ceph-node2.pid --public-addr 192.168.57.103
[ceph_deploy][ERROR ] GenericError: Failed to add monitor to host:  ceph-node2
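These errors (stale LOCK, admin socket already present, invalid mon data directory) look like leftovers from the earlier failed attempt. Maybe ceph-node2 needs a cleanup before retrying, something like:

# stop any half-started mon (or pkill ceph-mon if systemd units are not in use)
systemctl stop ceph-mon@ceph-node2
# remove the stale mon data dir and admin socket named in the errors above
rm -rf /var/lib/ceph/mon/ceph-ceph-node2
rm -f /var/run/ceph/ceph-mon.ceph-node2.asok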

The status is now:
[root@ceph-node1 ceph]# ceph status
    cluster eee6caf2-a7c6-411c-8711-a87aa4a66bf2
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs degraded
            64 pgs stuck inactive
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {ceph-node1=192.168.57.101:6789/0}
            election epoch 8, quorum 0 ceph-node1
     osdmap e24: 3 osds: 3 up, 3 in
            flags sortbitwise
      pgmap v45: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101836 kB used, 15227 MB / 15326 MB avail
                  64 undersized+degraded+peered
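The "too few PGs per OSD" warning is probably unrelated to the mon problem; with 3 OSDs it could presumably be cleared by raising pg_num on the single pool (assuming it is the default rbd pool):

ceph osd lspools                   # check the actual pool name first
ceph osd pool set rbd pg_num 128
ceph osd pool set rbd pgp_num 128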

Adding the mon manually:
The manual procedure fails at step 3 when run on ceph-node2:
ceph auth get mon. -o {tmp}/{key-filename} 

error: 
2016-07-14 16:26:52.469722 7f706bff7700  0 -- :/1183426694 >> 192.168.57.101:6789/0 pipe(0x7f707005c7d0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f707005da90).fault
2016-07-14 16:26:55.470789 7f706bef6700  0 -- :/1183426694 >> 192.168.57.101:6789/0 pipe(0x7f7060000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7060001f90).fault
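These .fault lines seem to mean ceph-node2 cannot open a connection to the mon at 192.168.57.101:6789 at all. A quick check from ceph-node2 could be:

ping -c 3 192.168.57.101
# try to open the mon port; prints the message only if the TCP connect succeeds
timeout 3 bash -c '</dev/tcp/192.168.57.101/6789' && echo "mon port reachable"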

So I ran the above command on ceph-node1 instead, and then scp'd the key and map over to ceph-node2.
Then I completed the remaining steps "successfully". But running ceph status on ceph-node2 gives:
[root@ceph-node2 ~]# ceph status
2016-07-14 17:01:30.134496 7f43f8164700  0 -- :/2056484158 >> 192.168.57.101:6789/0 pipe(0x7f43f405c7d0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f43f405da90).fault
2016-07-14 17:01:33.136259 7f43efd77700  0 -- :/2056484158 >> 192.168.57.101:6789/0 pipe(0x7f43e4000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f43e4001f90).fault
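This is the same connect failure as above, so probably the same network/firewall cause rather than an auth problem. Once ceph-node2 can reach port 6789, ceph status there will also need ceph.conf and the client.admin keyring, which ceph-deploy can push from ceph-node1:

ceph-deploy admin ceph-node2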

And on ceph-node1, ceph status still shows only one mon:
monmap e1: 1 mons at {ceph-node1=192.168.57.101:6789/0}
            election epoch 8, quorum 0 ceph-node1
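If the new mon never made it into the monmap, I suppose the remaining manual steps after the --mkfs step would be something like this (assuming 192.168.57.103 is ceph-node2's public address):

# add ceph-node2 to the monmap, then start its daemon
ceph mon add ceph-node2 192.168.57.103:6789
ceph-mon -i ceph-node2 --public-addr 192.168.57.103
ceph mon stat                      # should now list both mons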

This is ceph.conf:
[global]
fsid = eee6caf2-a7c6-411c-8711-a87aa4a66bf2
mon_initial_members = ceph-node1
mon_host = 192.168.57.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
public_network = 192.168.57.0/24
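Given the connect faults above, maybe port 6789 is simply blocked on the mon hosts. Assuming firewalld is in use, something like this on each node might be needed:

firewall-cmd --permanent --add-port=6789/tcp          # mon
firewall-cmd --permanent --add-port=6800-7300/tcp     # OSD port range
firewall-cmd --reload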


I used the root account during the whole cluster setup procedure.



