Hi guys!
After setting up a cluster of 10 servers and creating an image on a 450 TB pool, it got stuck at the mkfs step, and what I noticed was that my entire cluster was failing in a really weird way: OSDs on different nodes keep going down and coming back up, over and over.
So now I have a lot of degraded and stuck PGs :S I don't know why the cluster behaves that way (OSDs going up and down). Do you think it is a bug?
[root@capricornio ceph-cluster]# ceph status
    cluster d39f6247-1543-432d-9247-6c56f65bb6cd
     health HEALTH_WARN 109 pgs degraded; 251 pgs down; 1647 pgs peering; 2598 pgs stale; 1618 pgs stuck inactive; 1643 pgs stuck unclean; 109 pgs undersized; 1 requests are blocked > 32 sec; recovery 13/1838 objects degraded (0.707%); 1/107 in osds are down
     monmap e1: 3 mons at {capricornio=192.168.4.44:6789/0,geminis=192.168.4.37:6789/0,tauro=192.168.4.36:6789/0}, election epoch 50, quorum 0,1,2 tauro,geminis,capricornio
     osdmap e5275: 119 osds: 106 up, 107 in
      pgmap v15826: 8192 pgs, 1 pools, 2016 MB data, 919 objects
            48484 MB used, 388 TB / 388 TB avail
            13/1838 objects degraded (0.707%)
                2162 stale+active+clean
                1031 peering
                4274 active+clean
                   3 stale+remapped+peering
                  32 stale+active+undersized+degraded
                  43 stale+down+peering
                   4 remapped+peering
                  77 active+undersized+degraded
                 208 down+peering
                 358 stale+peering
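
In case it is useful, these are the standard commands I am planning to run next to get more detail on the flapping OSDs and the stuck PGs (the PG id 1.2f and OSD id 12 below are just placeholders, not real ids from my cluster):

    ceph health detail
    ceph osd tree
    ceph pg dump_stuck stale
    ceph pg dump_stuck inactive
    ceph pg 1.2f query                           # replace 1.2f with one of the stuck PG ids
    tail -n 200 /var/log/ceph/ceph-osd.12.log    # run on the node of one of the flapping OSDs

I was also wondering whether temporarily setting "ceph osd set noout" (and maybe nodown) while I go through the OSD logs is a reasonable way to stop the map churn, or whether that would just hide the real problem.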