Hi List,

I have a Ceph setup consisting of 3 nodes: 1 mon and 2 osds. It seems that both my osds are in but down. The osd processes on the osd nodes exist and are listening, and I can successfully telnet between all nodes, in both directions, on the respective ports. Still, my pgs are all stuck in this state:

pg 2.2 is stuck unclean since forever, current state creating, last acting []

Here is my ceph.conf: http://pastebin.com/SpQA38Em
and here is what 'ceph report' has to say: http://pastebin.com/3gPJhpnH

This is what the osd log shows (excerpt from osd.0):

2014-01-13 17:23:32.638235 7f98486a7700 5 osd.0 16 tick
2014-01-13 17:23:32.638270 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:32.638273 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:32.657880 7f9836e63700 20 osd.0 16 update_osd_stat osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:32.657935 7f9836e63700 5 osd.0 16 heartbeat: osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:33.638437 7f98486a7700 5 osd.0 16 tick
2014-01-13 17:23:33.638475 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:33.638479 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:33.758194 7f9836e63700 20 osd.0 16 update_osd_stat osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:33.758257 7f9836e63700 5 osd.0 16 heartbeat: osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:34.638658 7f98486a7700 5 osd.0 16 tick
2014-01-13 17:23:34.638692 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:34.638694 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:35.638936 7f98486a7700 5 osd.0 16 tick
...
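Incidentally, the heartbeat lines above always show empty peer lists (peers []/[]), so the osds do not seem to have any heartbeat peers at all. For context, this is roughly how I have been checking the cluster state from the mon node (nothing exotic, just the standard status commands; they assume the default client.admin keyring in /etc/ceph):

  ceph -s              # overall health; reports '2 osds: 0 up, 2 in' and 192 creating pgs
  ceph health detail   # lists the individual stuck pgs, e.g. pg 2.2 above
  ceph osd tree        # CRUSH tree with the up/down and in/out state of each osd
  ceph osd dump        # osd addresses and state as registered with the monitor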
This is what the mon log says:

2014-01-13 17:25:21.670754 7f10474b4700 11 mon.ceph0@0(leader) e1 tick
2014-01-13 17:25:21.670792 7f10474b4700 10 mon.ceph0@0(leader).pg v8 v8: 192 pgs: 192 creating; 0 bytes data, 0 kB used, 0 kB / 0 kB avail
2014-01-13 17:25:21.670821 7f10474b4700 10 mon.ceph0@0(leader).mds e1 e1: 0/0/1 up
2014-01-13 17:25:21.670831 7f10474b4700 10 mon.ceph0@0(leader).osd e7 e7: 2 osds: 0 up, 2 in
2014-01-13 17:25:21.670839 7f10474b4700 20 mon.ceph0@0(leader).osd e7 osd.0 laggy halflife 3600 decay_k -0.000192541 down for 5.000466 decay 0.999038
2014-01-13 17:25:21.670876 7f10474b4700 10 mon.ceph0@0(leader).osd e7 tick entire containing rack subtree for osd.0 is down; resetting timer
2014-01-13 17:25:21.670881 7f10474b4700 20 mon.ceph0@0(leader).osd e7 osd.1 laggy halflife 3600 decay_k -0.000192541 down for 5.000466 decay 0.999038
2014-01-13 17:25:21.670890 7f10474b4700 10 mon.ceph0@0(leader).osd e7 tick entire containing rack subtree for osd.1 is down; resetting timer
2014-01-13 17:25:21.670895 7f10474b4700 1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670896 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670917 7f10474b4700 1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670918 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670927 7f10474b4700 1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670928 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670934 7f10474b4700 10 mon.ceph0@0(leader).log v36 log
2014-01-13 17:25:21.670939 7f10474b4700 10 mon.ceph0@0(leader).auth v207 auth
2014-01-13 17:25:21.670951 7f10474b4700 20 mon.ceph0@0(leader) e1 sync_trim_providers

This is what 'ps aux | grep ceph' yields on each respective node:

mon.0:
root 6567 0.0 0.4 156984 13084 ? Sl 04:41 0:07 /usr/bin/ceph-mon -i ceph0 --pid-file /var/run/ceph/mon.ceph0.pid -c /etc/ceph/ceph.conf

osd.0:
root 3435 0.0 0.6 488344 20140 ? Ssl 04:41 0:26 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf

osd.1:
root 2926 0.0 0.6 487080 18912 ? Ssl 04:41 0:29 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf

This is what 'netstat -tapn | grep -i listen | grep ceph' yields on each respective node:

mon.0:
tcp 0 0 192.168.10.200:6789 0.0.0.0:* LISTEN 6567/ceph-mon

osd.0:
tcp 0 0 10.10.10.201:6800 0.0.0.0:* LISTEN 3435/ceph-osd
tcp 0 0 192.168.10.201:6800 0.0.0.0:* LISTEN 3435/ceph-osd
tcp 0 0 192.168.10.201:6801 0.0.0.0:* LISTEN 3435/ceph-osd
tcp 0 0 10.10.10.201:6801 0.0.0.0:* LISTEN 3435/ceph-osd
tcp 0 0 192.168.10.201:6802 0.0.0.0:* LISTEN 3435/ceph-osd

osd.1:
tcp 0 0 10.10.10.202:6800 0.0.0.0:* LISTEN 2926/ceph-osd
tcp 0 0 192.168.10.202:6800 0.0.0.0:* LISTEN 2926/ceph-osd
tcp 0 0 192.168.10.202:6801 0.0.0.0:* LISTEN 2926/ceph-osd
tcp 0 0 10.10.10.202:6801 0.0.0.0:* LISTEN 2926/ceph-osd
tcp 0 0 192.168.10.202:6802 0.0.0.0:* LISTEN 2926/ceph-osd

Thank you.

Best,
Moe 1984
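P.S. The connectivity tests mentioned at the top were roughly of this form, run in both directions between every pair of nodes (going by the netstat output above, 192.168.10.x looks like the public network and 10.10.10.x the cluster network):

  telnet 192.168.10.200 6789   # the mon, from both osd nodes
  telnet 192.168.10.201 6800   # osd.0, public network
  telnet 10.10.10.201 6801     # osd.0, cluster network
  telnet 192.168.10.202 6800   # osd.1, public network
  telnet 10.10.10.202 6801     # osd.1, cluster network

All of these connect, so plain TCP reachability between the nodes does not appear to be the problem.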