I’d like to better understand the current state of my Ceph cluster. I currently have 2 PGs in the ‘stuck unclean’ state:

# ceph health detail
HEALTH_WARN 2 pgs down; 2 pgs peering; 2 pgs stuck inactive; 2 pgs stuck unclean
pg 4.2a8 is stuck inactive for 124516.777791, current state down+peering, last acting [79,8,74]
pg 4.c3 is stuck inactive since forever, current state down+peering, last acting [56,79,67]
pg 4.2a8 is stuck unclean for 124536.223284, current state down+peering, last acting [79,8,74]
pg 4.c3 is stuck unclean since forever, current state down+peering, last acting [56,79,67]
pg 4.2a8 is down+peering, acting [79,8,74]
pg 4.c3 is down+peering, acting [56,79,67]

While my cluster does currently have some down OSDs, none of them are in the acting set of either PG:

# ceph osd tree | grep down
 73 1.00000 osd.73  down 0 1.00000
 96 1.00000 osd.96  down 0 1.00000
110 1.00000 osd.110 down 0 1.00000
116 1.00000 osd.116 down 0 1.00000
120 1.00000 osd.120 down 0 1.00000
126 1.00000 osd.126 down 0 1.00000
124 1.00000 osd.124 down 0 1.00000
119 1.00000 osd.119 down 0 1.00000

I’ve queried one of the two PGs and see that recovery is currently blocked on osd.116, which is indeed down, but is not part of that PG’s acting set (see the query sketch in the P.S. below).

This is all with Ceph version 0.94.3:

# ceph version
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)

Why does this PG remain ‘stuck unclean’? Are there any steps I can take to unstick it, given that all of the acting OSDs are up and in?

(* Re-sent, now that I’m subscribed to the list *)

-Chris
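P.S. For context on the “blocked on osd.116” observation above: the excerpt below is a trimmed, hand-edited sketch of the kind of output `ceph pg <pgid> query` gives for a PG in this state on Hammer. It is illustrative only, not a verbatim capture from my cluster, so the exact field names and values may differ from what you’d see.

# ceph pg 4.2a8 query
{
  ...
  "recovery_state": [
    { "name": "Started/Primary/Peering",
      "enter_time": "(timestamp)",
      "down_osds_we_would_probe": [ 116 ],
      "peering_blocked_by": [
        { "osd": 116,
          "current_lost_at": 0,
          "comment": "starting or marking this osd lost may let us proceed" } ] },
    { "name": "Started",
      "enter_time": "(timestamp)" } ]
}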