Dear all,

Ceph 0.72.2 is deployed on three hosts, but the cluster's status is HEALTH_WARN. The status is as follows:

# ceph -s
    cluster e25909ed-25d9-42fd-8c97-0ed31eec6194
     health HEALTH_WARN 768 pgs degraded; 768 pgs stuck unclean; recovery 2/3 objects degraded (66.667%)
     monmap e3: 3 mons at {ceph-node1=192.168.57.101:6789/0,ceph-node2=192.168.57.102:6789/0,ceph-node3=192.168.57.103:6789/0}, election epoch 34, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
     osdmap e170: 9 osds: 9 up, 9 in
      pgmap v1741: 768 pgs, 7 pools, 36 bytes data, 1 objects
            367 MB used, 45612 MB / 45980 MB avail
            2/3 objects degraded (66.667%)
                 768 active+degraded
Only 3 pools have been created, but 7 pools appear in the ceph status above.

# ceph osd lspools
5 data,6 metadata,7 rbd,
The object in pool 'data' has only one replica, although the pool's replication size is set to 3.

# ceph osd map data object1
osdmap e170 pool 'data' (5) object 'object1' -> pg 5.bac5debc (5.bc) -> up [6] acting [6]

# ceph osd dump | more
epoch 170
fsid e25909ed-25d9-42fd-8c97-0ed31eec6194
created 2015-03-16 11:23:28.805286
modified 2015-03-19 15:45:39.451077
flags
pool 5 'data' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 155 owner 0
pool 6 'metadata' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 161 owner 0
pool 7 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 163 owner 0
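In case it matters, this is a sketch of the standard commands that can be used to check and re-apply the replication size; the pool name 'data' and the value 3 just reflect my setup:

# ceph osd pool get data size        (shows the current 'size' setting of the pool)
# ceph osd pool set data size 3      (re-applies the replication size; 3 is my intended value)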
Other info is shown below:

# ceph osd tree
# id    weight  type name       up/down reweight
-1      0       root default
-7      0               rack rack03
-4      0                       host ceph-node3
6       0                               osd.6   up      1
7       0                               osd.7   up      1
8       0                               osd.8   up      1
-6      0               rack rack02
-3      0                       host ceph-node2
3       0                               osd.3   up      1
4       0                               osd.4   up      1
5       0                               osd.5   up      1
-5      0               rack rack01
-2      0                       host ceph-node1
0       0                               osd.0   up      1
1       0                               osd.1   up      1
2       0                               osd.2   up      1
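One thing I noticed is that every OSD and bucket in the tree above has a CRUSH weight of 0, though I am not sure whether that is relevant here. If it is, I understand a non-zero weight can be assigned for testing like this (osd.0 and 0.05 are example values only, not something I have already run):

# ceph osd crush reweight osd.0 0.05     (assigns a non-zero CRUSH weight to osd.0; example value)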
The crushmap is:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host ceph-node3 {
        id -4           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.6 weight 0.000
        item osd.7 weight 0.000
        item osd.8 weight 0.000
}
rack rack03 {
        id -7           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item ceph-node3 weight 0.000
}
host ceph-node2 {
        id -3           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.3 weight 0.000
        item osd.4 weight 0.000
        item osd.5 weight 0.000
}
rack rack02 {
        id -6           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item ceph-node2 weight 0.000
}
host ceph-node1 {
        id -2           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.000
        item osd.1 weight 0.000
        item osd.2 weight 0.000
}
rack rack01 {
        id -5           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item ceph-node1 weight 0.000
}
root default {
        id -1           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item rack03 weight 0.000
        item rack02 weight 0.000
        item rack01 weight 0.000
}

# rules
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
# end crush map

# ceph health detail | more
HEALTH_WARN 768 pgs degraded; 768 pgs stuck unclean; recovery 2/3 objects degraded (66.667%)
pg 5.17 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 6.14 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 7.15 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 5.14 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 6.17 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 7.16 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 5.15 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 6.16 is stuck unclean since forever, current state active+degraded, last acting [6]
pg 7.17 is stuck unclean since forever, current state active+degraded, last acting [6]
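In case it helps anyone reproduce the mapping behaviour, this is a sketch of how I understand the rule can be exercised offline with crushtool (the file name crushmap.bin is just an example; --rule 0 corresponds to ruleset 0 above, --num-rep 3 to the pool's replication size):

# ceph osd getcrushmap -o crushmap.bin
# crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings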
I have been researching this problem for a week, but have not found a solution. Could anyone tell me how to fix it? Thanks!

Regards,
Guanghua