Hi guys, thank you very much for your
feedback. I'm new to Ceph, so I ask you to be patient with my
newbie-ness.
I'm dealing with the same issue, although I'm not using ceph-deploy. For learning purposes I installed a small test cluster of three nodes manually: one to host the single mon and two for OSDs. I had managed to get this working and all seemed healthy. I then simulated a catastrophic event by pulling the plug on all three nodes. Since then I haven't been able to get things working again: no quorum is reached on the single-mon setup, and a ceph-create-keys process hangs.

This is my ceph.conf: http://pastebin.com/qyqeu5E4

This is what the ceph-related process list looks like on the mon node after a reboot; note the hanging ceph-create-keys:

    root@ceph0:/var/log/ceph# ps aux | grep ceph
    root       988  0.2  0.2  34204  7368 ?      S    15:36   0:00 /usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
    root      1449  0.0  0.1  94844  3972 ?      Ss   15:38   0:00 sshd: ceph [priv]
    ceph      1470  0.0  0.0  94844  1740 ?      S    15:38   0:00 sshd: ceph@pts/0
    ceph      1471  0.3  0.1  22308  3384 pts/0  Ss   15:38   0:00 -bash
    root      1670  0.0  0.0   9452   904 pts/0  R+   15:38   0:00 grep --color=auto ceph

As you can see, no mon process is started; I presume this is somehow a result of the ceph-create-keys process hanging.
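For readers following along: ceph-create-keys blocks until it can talk to the mon's admin socket, whose path is derived from the id passed with -i. The sketch below is mine, not code from the real script (the helper name mon_admin_socket is hypothetical), and it assumes the default /var/run/ceph location that the mon_status command later in this post uses:

```python
def mon_admin_socket(mon_id, cluster="ceph"):
    """Admin-socket path that ceph-create-keys polls for the given mon id
    (assuming the default run directory seen elsewhere in this thread)."""
    return "/var/run/ceph/{0}-mon.{1}.asok".format(cluster, mon_id)

print(mon_admin_socket("ceph0"))  # -> /var/run/ceph/ceph-mon.ceph0.asok
```

Note that the socket path depends entirely on the id given with -i: a different id means ceph-create-keys polls a different (possibly nonexistent) socket.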
In this state of the system, after a reboot, /var/log/ceph-mon.cehp0.log shows the following:

    2014-01-09 15:49:44.433943 7f9e45eb97c0  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
    2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb store

If I start the mon process manually with:

    start ceph-mon id=ceph0

it starts fine, and "ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status" outputs:

    { "name": "ceph0",
      "rank": 0,
      "state": "leader",
      "election_epoch": 1,
      "quorum": [0],
      "outside_quorum": [],
      "extra_probe_peers": [],
      "sync_provider": [],
      "monmap": { "epoch": 1,
          "fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
          "modified": "2014-01-08 02:00:23.264895",
          "created": "2014-01-08 02:00:23.264895",
          "mons": [
                { "rank": 0,
                  "name": "ceph0",
                  "addr": "192.168.10.200:6789\/0"}]}}

The mon process seems OK, but ceph-create-keys keeps hanging and there is no quorum. If I kill the ceph-create-keys process and run "/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0" manually, I get:

    admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
    INFO:ceph-create-keys:ceph-mon admin socket not ready yet.

every second or so. This is what happens when I terminate the manually started ceph-create-keys process:

    ^CTraceback (most recent call last):
      File "/usr/sbin/ceph-create-keys", line 227, in <module>
        main()
      File "/usr/sbin/ceph-create-keys", line 213, in main
        wait_for_quorum(cluster=args.cluster, mon_id=args.id)
      File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
        time.sleep(1)
    KeyboardInterrupt

I will finish this long post by pasting what happens if I try to restart all services on the cluster, just so you know that the mon problem is only the first one I'm battling with here :) http://pastebin.com/mPGhiYu5

Please note that after the above global restart, the hanging ceph-create-keys process is back.
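The wait_for_quorum frame in that traceback is the loop ceph-create-keys sits in: it repeatedly reads mon_status over the admin socket and checks whether the mon has joined quorum. The snippet below is only a minimal sketch of that check (the in_quorum helper is my name, and the dict is trimmed from the mon_status output above, not read from a live socket):

```python
# Fields trimmed from the mon_status output pasted above; a mon is in
# quorum when its own rank appears in the "quorum" list.
mon_status = {
    "name": "ceph0",
    "rank": 0,
    "state": "leader",
    "quorum": [0],
    "outside_quorum": [],
}

def in_quorum(status):
    """Sketch of the quorum test: does this mon's rank appear in "quorum"?"""
    return status["rank"] in status["quorum"]

print(in_quorum(mon_status))  # -> True
```

By this check the manually started mon *has* formed quorum, which suggests the hung ceph-create-keys never reaches this stage because it cannot open the admin socket at all (the "admin socket not ready yet" message).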
Best,
Moe

On 01/09/2014 09:51 AM, Travis Rhoden wrote:
> On Thu, Jan 9, 2014 at 9:48 AM, Alfredo Deza <alfredo.deza@xxxxxxxxxxx> wrote:
>> On Thu, Jan 9, 2014 at 9:45 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
>>> Hi Mordur,
>>>
>>> I'm definitely straining my memory on this one, but happy to help if I can. I'm pretty sure I did not figure it out -- you can see I didn't get any feedback from the list. What I did do, however, was uninstall everything and try the same setup with mkcephfs, which worked fine at the time. That was 8 months ago, though, and I have since used ceph-deploy many times with great success. I'm not sure I have ever tried a similar setup with just one node and one monitor, though. Fortuitously, I may be trying that very setup today or tomorrow. If I still have issues, I will be sure to post them here.
>>>
>>> Are you using both the latest ceph-deploy and the latest Ceph packages (Emperor or newer dev packages)? There have been lots of changes in the monitor area, including in the upstart scripts, that made many things more robust. I did have a cluster a few months ago with a flaky monitor that refused to join quorum after install; I had to blow it away and re-install/deploy it, and then it was fine, which I thought was odd. Sorry, that's probably not much help.
>>>
>>> - Travis
>>>
>>> On Thu, Jan 9, 2014 at 12:40 AM, Mordur Ingolfsson <rass@xxxxxxx> wrote:
>>>> Hi Travis,
>>>>
>>>> Did you figure this out? I'm dealing with exactly the same thing over here.
>>>>
>>>> Best,
>>>> Moe
>>
>> Can you share what exactly you are having problems with? ceph-deploy's log output has been much improved and it is super useful to have that when dealing with possible issues.
>
> I do not, it was long long ago... And in case it was ambiguous, let me explicitly say that I was not recommending the use of mkcephfs at all (is that even still possible?). ceph-deploy is certainly the tool to use.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com