On Thu, 19 May 2016, Gaurav Bafna wrote:
> Hi Cephers,
>
> In our production cluster at Reliance Jio, when an OSD goes corrupt
> and crashes, the cluster remains unhealthy even after 4 hours.
>
>     cluster fac04d85-db48-4564-b821-deebda046261
>      health HEALTH_WARN
>             658 pgs degraded
>             658 pgs stuck degraded
>             688 pgs stuck unclean
>             658 pgs stuck undersized
>             658 pgs undersized

^^^ this...

>             recovery 3064/1981308 objects degraded (0.155%)
>             recovery 124/1981308 objects misplaced (0.006%)
>      monmap e11: 11 mons at
> {dssmon2=10.140.208.224:6789/0,dssmon3=10.140.208.225:6789/0,dssmon31=10.135.38.141:6789/0,dssmon32=10.135.38.142:6789/0,dssmon33=10.135.38.143:6789/0,dssmon34=10.135.38.144:6789/0,dssmon35=10.135.38.145:6789/0,dssmon4=10.140.208.226:6789/0,dssmon5=10.140.208.227:6789/0,dssmon6=10.140.208.228:6789/0,dssmonleader1=10.140.208.223:6789/0}
>             election epoch 792, quorum 0,1,2,3,4,5,6,7,8,9,10
> dssmon31,dssmon32,dssmon33,dssmon34,dssmon35,dssmonleader1,dssmon2,dssmon3,dssmon4,dssmon5,dssmon6
>      osdmap e8778: 2774 osds: 2746 up, 2746 in; 30 remapped pgs

doesn't match this ^^

which makes it look like a problem with OSDs reporting PG state to the
mon.  The fact that an OSD restarts supports that theory.

What version is this?  A bunch of the osd -> mon pg reporting code was
recently rewritten (between infernalis and jewel), so the new code is
hopefully more robust.  (OTOH, it is also new, so we may have missed
something.)

Nice big cluster!

sage
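
A rough sketch of how one might check both points from the CLI (the
running versions, and whether the mon's view disagrees with what the
OSDs themselves report).  The OSD id and pg id below are placeholders,
not values from this cluster; only the mon name is taken from the
monmap above:

    # confirm the versions actually running on the mons and OSDs
    ceph --version
    ceph tell mon.dssmonleader1 version
    ceph tell osd.123 version          # substitute a real OSD id

    # list the PGs the mon considers stuck, then ask the acting OSDs directly
    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg 13.7a query                # substitute a pg id from dump_stuck

If the acting OSDs' own view of a PG (from pg query) looks clean while
the mon still reports it degraded/undersized, that points at the
osd -> mon reporting path rather than at actually missing replicas.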