Peering or connections problem

Hi List,

I have a Ceph setup consisting of 3 nodes: 1 mon and 2 OSDs. It seems that both my OSDs are in but down. The ceph-osd processes on the OSD nodes are running and listening, and I can successfully telnet from every node to every other node on the respective ports. Still, my PGs are all stuck in this status:

pg 2.2 is stuck unclean since forever, current state creating, last acting []
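
For reference, I am pulling this status with commands along these lines (output omitted here; the full maps are in the 'ceph report' paste below):

ceph -s                      # overall cluster status
ceph health detail           # prints the stuck-PG lines like the one above
ceph osd tree                # shows both OSDs as down
ceph pg dump_stuck unclean   # the stuck PGs in tabular form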


Here is my ceph.conf:

http://pastebin.com/SpQA38Em
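
Judging by the netstat output further down, the OSDs bind to both the 192.168.10.0/24 and the 10.10.10.0/24 networks, i.e. the conf defines separate public and cluster networks, roughly of this form (hypothetical values for illustration; the exact settings are in the paste above):

[global]
    public network  = 192.168.10.0/24   # mon/client-facing traffic
    cluster network = 10.10.10.0/24     # OSD-to-OSD replication traffic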

And here is what 'ceph report' has to say:

http://pastebin.com/3gPJhpnH


This is what the OSD log (osd.0) shows; note that the peers list in the heartbeat lines stays empty:

2014-01-13 17:23:32.638235 7f98486a7700  5 osd.0 16 tick
2014-01-13 17:23:32.638270 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:32.638273 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:32.657880 7f9836e63700 20 osd.0 16 update_osd_stat osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:32.657935 7f9836e63700  5 osd.0 16 heartbeat: osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:33.638437 7f98486a7700  5 osd.0 16 tick
2014-01-13 17:23:33.638475 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:33.638479 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:33.758194 7f9836e63700 20 osd.0 16 update_osd_stat osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:33.758257 7f9836e63700  5 osd.0 16 heartbeat: osd_stat(1057 MB used, 29646 MB avail, 30704 MB total, peers []/[] op hist [])
2014-01-13 17:23:34.638658 7f98486a7700  5 osd.0 16 tick
2014-01-13 17:23:34.638692 7f98486a7700 10 osd.0 16 do_waiters -- start
2014-01-13 17:23:34.638694 7f98486a7700 10 osd.0 16 do_waiters -- finish
2014-01-13 17:23:35.638936 7f98486a7700  5 osd.0 16 tick
.
.
.


And this is what the mon log says:

2014-01-13 17:25:21.670754 7f10474b4700 11 mon.ceph0@0(leader) e1 tick
2014-01-13 17:25:21.670792 7f10474b4700 10 mon.ceph0@0(leader).pg v8 v8: 192 pgs: 192 creating; 0 bytes data, 0 kB used, 0 kB / 0 kB avail
2014-01-13 17:25:21.670821 7f10474b4700 10 mon.ceph0@0(leader).mds e1 e1: 0/0/1 up
2014-01-13 17:25:21.670831 7f10474b4700 10 mon.ceph0@0(leader).osd e7 e7: 2 osds: 0 up, 2 in
2014-01-13 17:25:21.670839 7f10474b4700 20 mon.ceph0@0(leader).osd e7 osd.0 laggy halflife 3600 decay_k -0.000192541 down for 5.000466 decay 0.999038
2014-01-13 17:25:21.670876 7f10474b4700 10 mon.ceph0@0(leader).osd e7 tick entire containing rack subtree for osd.0 is down; resetting timer
2014-01-13 17:25:21.670881 7f10474b4700 20 mon.ceph0@0(leader).osd e7 osd.1 laggy halflife 3600 decay_k -0.000192541 down for 5.000466 decay 0.999038
2014-01-13 17:25:21.670890 7f10474b4700 10 mon.ceph0@0(leader).osd e7 tick entire containing rack subtree for osd.1 is down; resetting timer
2014-01-13 17:25:21.670895 7f10474b4700  1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670896 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670917 7f10474b4700  1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670918 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670927 7f10474b4700  1 mon.ceph0@0(leader).paxos(paxos active c 1..260) is_readable now=2014-01-13 17:25:21.670928 lease_expire=0.000000 has v0 lc 260
2014-01-13 17:25:21.670934 7f10474b4700 10 mon.ceph0@0(leader).log v36 log
2014-01-13 17:25:21.670939 7f10474b4700 10 mon.ceph0@0(leader).auth v207 auth
2014-01-13 17:25:21.670951 7f10474b4700 20 mon.ceph0@0(leader) e1 sync_trim_providers

This is what 'ps aux | grep ceph' yields on each respective node:

mon.0

root      6567  0.0  0.4 156984 13084 ?        Sl   04:41   0:07 /usr/bin/ceph-mon -i ceph0 --pid-file /var/run/ceph/mon.ceph0.pid -c /etc/ceph/ceph.conf

osd.0
root      3435  0.0  0.6 488344 20140 ?        Ssl  04:41   0:26 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf

osd.1
root      2926  0.0  0.6 487080 18912 ?        Ssl  04:41   0:29 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf

This is what 'netstat -tapn | grep -i listen | grep ceph' yields on each respective node:

mon.0
tcp        0      0 192.168.10.200:6789     0.0.0.0:*               LISTEN      6567/ceph-mon  

osd.0
tcp        0      0 10.10.10.201:6800       0.0.0.0:*               LISTEN      3435/ceph-osd  
tcp        0      0 192.168.10.201:6800     0.0.0.0:*               LISTEN      3435/ceph-osd  
tcp        0      0 192.168.10.201:6801     0.0.0.0:*               LISTEN      3435/ceph-osd  
tcp        0      0 10.10.10.201:6801       0.0.0.0:*               LISTEN      3435/ceph-osd  
tcp        0      0 192.168.10.201:6802     0.0.0.0:*               LISTEN      3435/ceph-osd

osd.1
tcp        0      0 10.10.10.202:6800       0.0.0.0:*               LISTEN      2926/ceph-osd  
tcp        0      0 192.168.10.202:6800     0.0.0.0:*               LISTEN      2926/ceph-osd  
tcp        0      0 192.168.10.202:6801     0.0.0.0:*               LISTEN      2926/ceph-osd  
tcp        0      0 10.10.10.202:6801       0.0.0.0:*               LISTEN      2926/ceph-osd  
tcp        0      0 192.168.10.202:6802     0.0.0.0:*               LISTEN      2926/ceph-osd
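
The telnet tests mentioned at the top were along these lines, run in both directions between all three nodes (addresses and ports taken from the listings above):

telnet 192.168.10.200 6789   # mon port, from each OSD node
telnet 192.168.10.201 6800   # osd.0 public address, from the mon node and osd.1
telnet 10.10.10.202 6801     # osd.1 cluster address, from osd.0

All of these connect, so plain TCP reachability does not seem to be the problem.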


Thank you.

Best,
Moe
1984