Re: ceph-create-keys hung

Hi Joao,
I tried the latest ceph-deploy (1.2.6) and the latest dumpling release (0.67.4) as well. I am getting the following messages during monitor creation on RHEL 6.4.

2013-10-07 13:54:09,864 [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
2013-10-07 13:54:09,864 [ceph_deploy.new][DEBUG ] Resolving host jul
2013-10-07 13:54:10,123 [ceph_deploy.new][DEBUG ] Monitor jul at 15.213.24.231
2013-10-07 13:54:10,124 [ceph_deploy.new][DEBUG ] Resolving host dec
2013-10-07 13:54:10,383 [ceph_deploy.new][DEBUG ] Monitor dec at 15.213.24.241
2013-10-07 13:54:10,383 [ceph_deploy.new][DEBUG ] Resolving host julilo
2013-10-07 13:54:10,643 [ceph_deploy.new][DEBUG ] Monitor julilo at 15.213.24.230
2013-10-07 13:54:10,643 [ceph_deploy.new][DEBUG ] Monitor initial members are ['jul', 'dec', 'julilo']
2013-10-07 13:54:10,644 [ceph_deploy.new][DEBUG ] Monitor addrs are ['15.213.24.231', '15.213.24.241', '15.213.24.230']
2013-10-07 13:54:10,644 [ceph_deploy.new][DEBUG ] Creating a random mon key...
2013-10-07 13:54:10,644 [ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
2013-10-07 13:54:10,645 [ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
2013-10-07 13:56:40,271 [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts dec jul julilo
2013-10-07 13:56:40,272 [ceph_deploy.mon][DEBUG ] detecting platform for host dec ...
2013-10-07 13:56:40,272 [ceph_deploy.sudo_pushy][DEBUG ] will use a remote connection without sudo
2013-10-07 13:56:43,909 [ceph_deploy.mon][INFO  ] distro info: RedHatEnterpriseServer 6.4 Santiago
2013-10-07 13:56:43,909 [dec][DEBUG ] determining if provided host has same hostname in remote
2013-10-07 13:56:43,918 [dec][DEBUG ] deploying mon to dec
2013-10-07 13:56:43,925 [dec][DEBUG ] remote hostname: dec
2013-10-07 13:56:43,942 [dec][INFO  ] write cluster configuration to /etc/ceph/{cluster}.conf
2013-10-07 13:56:43,988 [dec][INFO  ] creating path: /var/lib/ceph/mon/ceph-dec
2013-10-07 13:56:43,994 [dec][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-dec/done
2013-10-07 13:56:44,003 [dec][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-dec/done
2013-10-07 13:56:44,013 [dec][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-dec.mon.keyring
2013-10-07 13:56:44,030 [dec][INFO  ] create the monitor keyring file
2013-10-07 13:56:44,061 [dec][INFO  ] Running command: ceph-mon --cluster ceph --mkfs -i dec --keyring /var/lib/ceph/tmp/ceph-dec.mon.keyring
2013-10-07 13:56:44,192 [dec][INFO  ] ceph-mon: mon.noname-b 15.213.24.241:6789/0 is local, renaming to mon.dec
2013-10-07 13:56:44,193 [dec][INFO  ] ceph-mon: set fsid to 00255b71-7272-4ff0-8b56-60d27f2c7596
2013-10-07 13:56:44,193 [dec][INFO  ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-dec for mon.dec
2013-10-07 13:56:44,201 [dec][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-dec.mon.keyring
2013-10-07 13:56:44,224 [dec][INFO  ] create a done file to avoid re-doing the mon deployment
2013-10-07 13:56:44,272 [dec][INFO  ] create the init path if it does not exist
2013-10-07 13:56:44,303 [dec][INFO  ] locating `service` executable...
2013-10-07 13:56:44,314 [dec][INFO  ] found `service` executable: /sbin/service
2013-10-07 13:56:47,701 [dec][INFO  ] Running command: /sbin/service ceph start mon.dec
2013-10-07 13:56:48,231 [dec][DEBUG ] === mon.dec ===
2013-10-07 13:56:48,232 [dec][DEBUG ] Starting Ceph mon.dec on dec...
2013-10-07 13:56:48,232 [dec][DEBUG ] Starting ceph-create-keys on dec...
2013-10-07 13:56:55,232 [dec][WARNING] No data was received after 7 seconds, disconnecting...
2013-10-07 13:57:00,519 [dec][INFO  ] Running command: ceph daemon mon.dec mon_status
2013-10-07 13:57:00,837 [dec][DEBUG ] ********************************************************************************
2013-10-07 13:57:00,838 [dec][DEBUG ] status for monitor: mon.dec
2013-10-07 13:57:00,838 [dec][DEBUG ] { "name": "dec",
2013-10-07 13:57:00,838 [dec][DEBUG ]   "rank": 0,
2013-10-07 13:57:00,838 [dec][DEBUG ]   "state": "probing",
2013-10-07 13:57:00,838 [dec][DEBUG ]   "election_epoch": 0,
2013-10-07 13:57:00,839 [dec][DEBUG ]   "quorum": [],
2013-10-07 13:57:00,839 [dec][DEBUG ]   "outside_quorum": [
2013-10-07 13:57:00,839 [dec][DEBUG ]         "dec"],
2013-10-07 13:57:00,839 [dec][DEBUG ]   "extra_probe_peers": [
2013-10-07 13:57:00,839 [dec][DEBUG ]         "15.213.24.230:6789\/0",
2013-10-07 13:57:00,840 [dec][DEBUG ]         "15.213.24.231:6789\/0"],
2013-10-07 13:57:00,840 [dec][DEBUG ]   "sync_provider": [],
2013-10-07 13:57:00,840 [dec][DEBUG ]   "monmap": { "epoch": 0,
2013-10-07 13:57:00,840 [dec][DEBUG ]       "fsid": "00255b71-7272-4ff0-8b56-60d27f2c7596",
2013-10-07 13:57:00,840 [dec][DEBUG ]       "modified": "0.000000",
2013-10-07 13:57:00,841 [dec][DEBUG ]       "created": "0.000000",
2013-10-07 13:57:00,841 [dec][DEBUG ]       "mons": [
2013-10-07 13:57:00,841 [dec][DEBUG ]             { "rank": 0,
2013-10-07 13:57:00,841 [dec][DEBUG ]               "name": "dec",
2013-10-07 13:57:00,841 [dec][DEBUG ]               "addr": "15.213.24.241:6789\/0"},
2013-10-07 13:57:00,842 [dec][DEBUG ]             { "rank": 1,
2013-10-07 13:57:00,842 [dec][DEBUG ]               "name": "jul",
2013-10-07 13:57:00,842 [dec][DEBUG ]               "addr": "0.0.0.0:0\/1"},
2013-10-07 13:57:00,842 [dec][DEBUG ]             { "rank": 2,
2013-10-07 13:57:00,842 [dec][DEBUG ]               "name": "julilo",
2013-10-07 13:57:00,842 [dec][DEBUG ]               "addr": "0.0.0.0:0\/2"}]}}
2013-10-07 13:57:00,843 [dec][DEBUG ]
2013-10-07 13:57:00,843 [dec][DEBUG ] ********************************************************************************
2013-10-07 13:57:00,843 [dec][INFO  ] monitor: mon.dec is running
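
For anyone reading along, the mon_status output above can be checked mechanically. This is a minimal Python sketch and not part of ceph-deploy; the helper name `summarize_mon_status` is hypothetical, and the sample is a trimmed copy of the output above. It reports whether the monitor sees a quorum and which monmap entries still carry a blank 0.0.0.0 address.

```python
import json

# Trimmed copy of the mon_status output shown above (raw string keeps the \/ escapes).
SAMPLE = r'''
{ "name": "dec",
  "rank": 0,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": ["dec"],
  "monmap": { "epoch": 0,
      "mons": [
            { "rank": 0, "name": "dec", "addr": "15.213.24.241:6789\/0"},
            { "rank": 1, "name": "jul", "addr": "0.0.0.0:0\/1"},
            { "rank": 2, "name": "julilo", "addr": "0.0.0.0:0\/2"}]}}
'''


def summarize_mon_status(status):
    """Return (has_quorum, names of monmap entries with a blank address).

    Hypothetical helper for illustration only.
    """
    has_quorum = len(status.get("quorum", [])) > 0
    blank = [m["name"] for m in status["monmap"]["mons"]
             if m["addr"].startswith("0.0.0.0:")]
    return has_quorum, blank


if __name__ == "__main__":
    status = json.loads(SAMPLE)
    has_quorum, blank = summarize_mon_status(status)
    print("state:", status["state"])
    print("quorum formed:", has_quorum)
    print("monmap entries with a blank address:", blank)
```

For the output above, this reports no quorum, with jul and julilo still listed at a blank address in mon.dec's monmap.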

Regards,
Abhay


On Tue, Oct 8, 2013 at 10:10 AM, Abhay Sachan <abhaysac@xxxxxxxxx> wrote:
Hi Joao,
Thanks for replying. All of my monitors are up and running and connected to each other. "ceph -s" is failing on the cluster with the following error:

2013-10-07 10:12:25.099261 7fd1b948d700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2013-10-07 10:12:25.099271 7fd1b948d700  0 librados: client.admin initialization error (2) No such file or directory
Error connecting to cluster: ObjectNotFound

And the logs on each monitor have lots of entries like this:
NODE 1:

2013-10-07 03:58:51.153847 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901820 avail 32280480
2013-10-07 03:59:51.154051 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901832 avail 32280468
2013-10-07 04:00:51.154256 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901828 avail 32280472

NODE 2:
2013-10-07 10:12:10.345491 7fa6145b0700  0 mon.dec@0(probing).data_health(0) update_stats avail 75% total 42332700 used 8199004 avail 31983296
2013-10-07 10:13:10.345677 7fa6145b0700  0 mon.dec@0(probing).data_health(0) update_stats avail 75% total 42332700 used 8199004 avail 31983296
2013-10-07 10:14:10.345921 7fa6145b0700  0 mon.dec@0(probing).data_health(0) update_stats avail 75% total 42332700 used 8199024 avail 31983276

NODE 3:

2013-10-07 10:13:00.880250 7fcd6459e700  0 mon.julilo@0(probing).data_health(0) update_stats avail 35% total 42332700 used 25105920 avail 15076380
2013-10-07 10:14:00.880470 7fcd6459e700  0 mon.julilo@0(probing).data_health(0) update_stats avail 35% total 42332700 used 25105924 avail 15076376
2013-10-07 10:15:00.880668 7fcd6459e700  0 mon.julilo@0(probing).data_health(0) update_stats avail 35% total 42332700 used 25105924 avail 15076376

If you need some other logs, then please tell me how to enable/fetch them. I will upload them someplace.


Regards,
Abhay

On Thu, Oct 3, 2013 at 8:31 PM, Joao Eduardo Luis <joao.luis@xxxxxxxxxxx> wrote:
On 10/03/2013 02:44 PM, Abhay Sachan wrote:
Hi All,
I have tried setting up a Ceph cluster with 3 nodes (3 monitors). I am
using RHEL 6.4 as the OS with the dumpling (0.67.3) release. During cluster
creation (using ceph-deploy as well as mkcephfs), ceph-create-keys
doesn't return on any of the servers. Whereas, if I create a cluster
with only 1 node (1 monitor), key creation goes through. Has anybody
seen this problem, or does anyone have ideas about what I might be missing?

Regards,
Abhay

Those symptoms tell me that your monitors are not forming quorum. 'ceph-create-keys' needs the monitors to first establish a quorum, otherwise it will hang waiting for that to happen.

Please make sure all your monitors are running.  If so, try running 'ceph -s' on your cluster.  If that hangs as well, try accessing each monitor's admin socket to check what's happening [1].  If that too fails, try looking into the logs for something obviously wrong.  If you are not able to discern anything useful at that point, upload the logs to some place and point us to them -- we'll then be happy to take a look.
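
The admin-socket check suggested above can be scripted. This is a minimal Python sketch, assuming it runs on the monitor host itself with the `ceph` CLI on the PATH, and using the `ceph daemon mon.<id> mon_status` command that appears in the ceph-deploy log. The function names and the injectable `run` parameter are hypothetical, added only so the logic can be exercised without a live cluster.

```python
import json
import subprocess


def fetch_mon_status(mon_id, run=subprocess.check_output):
    """Query the local monitor's admin socket and return the parsed status.

    `run` is a hypothetical hook; by default it executes the real command.
    """
    # Same command ceph-deploy runs: ceph daemon mon.<id> mon_status
    raw = run(["ceph", "daemon", "mon.%s" % mon_id, "mon_status"])
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


def in_quorum(status):
    """True when the monitor reports a non-empty quorum list."""
    return bool(status.get("quorum"))
```

On each monitor host, something like `in_quorum(fetch_mon_status("dec"))` (substituting that host's monitor id) would show whether that monitor believes a quorum exists; a monitor that stays in the "probing" state with an empty quorum list matches the hang described here.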

Hope this helps.

  -Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

