We have a new Ceph cluster, and when I follow the guide (http://ceph.com/docs/master/start/quick-ceph-deploy/) and reach the section on adding additional monitors, the step fails, and it almost seems like it is using an improper IP address.
This all seemed fine, and after adding in all of our OSDs, ceph -s reports the status shown further below.
We have 4 nodes:
- lts-mon
- lts-osd1
- lts-osd2
- lts-osd3
Using ceph-deploy, we created a new cluster with lts-mon as the initial monitor:
ceph-deploy new lts-mon
ceph-deploy install lts-mon lts-osd1 lts-osd2 lts-osd3
ceph-deploy mon create-initial
ceph-deploy osd prepare ........
ceph-deploy mds lts-mon
The only modifications I made to ceph.conf were to include the public and cluster network settings, and set the osd pool default size:
[global]
fsid = 5ca0e0f5-d367-48b8-97b4-48e8b12fd517
mon_initial_members = lts-mon
mon_host = 10.5.68.236
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 3
public_network = 10.5.68.0/22
cluster_network = 10.1.1.0/24
# ceph -s
    cluster f4adbd94-bf49-42f2-bd57-ebc7db9aa863
     health HEALTH_WARN
            too few PGs per OSD (1 < min 30)
     monmap e1: 1 mons at {lts-mon=10.5.68.236:6789/0}
            election epoch 1, quorum 0 lts-mon
     osdmap e471: 102 osds: 102 up, 102 in
      pgmap v973: 64 pgs, 1 pools, 0 bytes data, 0 objects
            515 GB used, 370 TB / 370 TB avail
                  64 active+clean
We have not yet defined the default pg count, so that warning seems okay for now.
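(For what it's worth, my understanding is we could clear that warning later by raising the placement-group count on the one pool; the pool name "rbd" and the count below are assumptions on my part for a 102-OSD, size-3 cluster, not something we have actually run.)

```shell
# Assumed fix for "too few PGs per OSD": raise pg_num/pgp_num on the
# default pool. 4096 is a rough guess (~100 OSDs x ~100 PGs / 3 replicas,
# rounded to a power of two) -- please correct me if this is off.
ceph osd pool set rbd pg_num 4096
ceph osd pool set rbd pgp_num 4096
```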
The problem we have is when adding a new monitor:
ceph-deploy mon create lts-osd1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon create lts-osd1
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts lts-osd1
[ceph_deploy.mon][DEBUG ] detecting platform for host lts-osd1 ...
[lts-osd1][DEBUG ] connection detected need for sudo
[lts-osd1][DEBUG ] connected to host: lts-osd1
[lts-osd1][DEBUG ] detect platform information from remote host
[lts-osd1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[lts-osd1][DEBUG ] determining if provided host has same hostname in remote
[lts-osd1][DEBUG ] get remote short hostname
[lts-osd1][DEBUG ] deploying mon to lts-osd1
[lts-osd1][DEBUG ] get remote short hostname
[lts-osd1][DEBUG ] remote hostname: lts-osd1
[lts-osd1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[lts-osd1][DEBUG ] create the mon path if it does not exist
[lts-osd1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-lts-osd1/done
[lts-osd1][DEBUG ] create a done file to avoid re-doing the mon deployment
[lts-osd1][DEBUG ] create the init path if it does not exist
[lts-osd1][DEBUG ] locating the `service` executable...
[lts-osd1][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph id=lts-osd1
[lts-osd1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.lts-osd1.asok mon_status
[lts-osd1][DEBUG ] ********************************************************************************
[lts-osd1][DEBUG ] status for monitor: mon.lts-osd1
[lts-osd1][DEBUG ] {
[lts-osd1][DEBUG ]   "election_epoch": 0,
[lts-osd1][DEBUG ]   "extra_probe_peers": [
[lts-osd1][DEBUG ]     "10.5.68.236:6789/0"
[lts-osd1][DEBUG ]   ],
[lts-osd1][DEBUG ]   "monmap": {
[lts-osd1][DEBUG ]     "created": "0.000000",
[lts-osd1][DEBUG ]     "epoch": 0,
[lts-osd1][DEBUG ]     "fsid": "5ca0e0f5-d367-48b8-97b4-48e8b12fd517",
[lts-osd1][DEBUG ]     "modified": "0.000000",
[lts-osd1][DEBUG ]     "mons": [
[lts-osd1][DEBUG ]       {
[lts-osd1][DEBUG ]         "addr": "0.0.0.0:0/1",
[lts-osd1][DEBUG ]         "name": "lts-mon",
[lts-osd1][DEBUG ]         "rank": 0
[lts-osd1][DEBUG ]       }
[lts-osd1][DEBUG ]     ]
[lts-osd1][DEBUG ]   },
[lts-osd1][DEBUG ]   "name": "lts-osd1",
[lts-osd1][DEBUG ]   "outside_quorum": [],
[lts-osd1][DEBUG ]   "quorum": [],
[lts-osd1][DEBUG ]   "rank": -1,
[lts-osd1][DEBUG ]   "state": "probing",
[lts-osd1][DEBUG ]   "sync_provider": []
[lts-osd1][DEBUG ] }
[lts-osd1][DEBUG ] ********************************************************************************
[lts-osd1][INFO  ] monitor: mon.lts-osd1 is currently at the state of probing
[lts-osd1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.lts-osd1.asok mon_status
[lts-osd1][WARNIN] lts-osd1 is not defined in `mon initial members`
[lts-osd1][WARNIN] monitor lts-osd1 does not exist in monmap
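The "addr": "0.0.0.0:0/1" for lts-mon in that monmap is what makes me suspect an improper IP address. These are the checks I was planning to run on the existing monitor to see what address it actually thinks it has (hostnames are ours; the commands are standard ceph CLI, so correct me if there is a better way):

```shell
# On lts-mon: what address does the current monmap record for the mon?
sudo ceph mon dump

# Same question via the admin socket, bypassing the network path entirely:
sudo ceph --admin-daemon /var/run/ceph/ceph-mon.lts-mon.asok mon_status
```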
The log on the monitor I was trying to add shows:
2015-06-09 11:33:24.661466 7fef2a806700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-06-09 11:33:24.661478 7fef2a806700  0 -- 10.5.68.229:6789/0 >> 10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40912 s=1 pgs=0 cs=0 l=0 c=0x34083c0).failed verifying authorize reply
2015-06-09 11:33:24.763579 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:24.763651 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:25.825163 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:25.825259 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:26.661737 7fef2a806700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
2015-06-09 11:33:26.661750 7fef2a806700  0 -- 10.5.68.229:6789/0 >> 10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40914 s=1 pgs=0 cs=0 l=0 c=0x34083c0).failed verifying authorize reply
2015-06-09 11:33:26.887973 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:26.888047 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:27.950014 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:27.950113 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
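That "verify_reply couldn't decrypt" line also reads to me like the two monitors disagree on the cephx mon. secret, not just clocks. One thing we have not yet tried is comparing the mon keyrings directly (the paths below are the standard mon data directories for our hostnames; assumes ssh access from the admin node):

```shell
# The mon. key must be identical on every monitor host -- compare them:
ssh lts-mon  sudo cat /var/lib/ceph/mon/ceph-lts-mon/keyring
ssh lts-osd1 sudo cat /var/lib/ceph/mon/ceph-lts-osd1/keyring
```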
All of our Google searching seems to indicate that there may be a clock skew, but the clocks are matched to within 0.001 seconds.
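For reference, this is roughly how we compared the clocks (assumes passwordless ssh from the admin node; the loop is just back-to-back date calls, so it only bounds skew to within the ssh round-trip):

```shell
# Print each host's clock with sub-second resolution, back to back:
for h in lts-mon lts-osd1 lts-osd2 lts-osd3; do
    printf '%s: ' "$h"
    ssh "$h" date +%s.%N
done
```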
Any assistance is much appreciated, thanks,
Mike C
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com