On Thu, 19 May 2016, Gaurav Bafna wrote:
> Hi Cephers,
>
> In our production cluster at Reliance Jio, when an OSD goes corrupt
> and crashes, the cluster remains unhealthy even after 4 hours.
>
>     cluster fac04d85-db48-4564-b821-deebda046261
>      health HEALTH_WARN
>             658 pgs degraded
>             658 pgs stuck degraded
>             688 pgs stuck unclean
>             658 pgs stuck undersized
>             658 pgs undersized

^^^ this...

>             recovery 3064/1981308 objects degraded (0.155%)
>             recovery 124/1981308 objects misplaced (0.006%)
>      monmap e11: 11 mons at
> {dssmon2=10.140.208.224:6789/0,dssmon3=10.140.208.225:6789/0,dssmon31=10.135.38.141:6789/0,dssmon32=10.135.38.142:6789/0,dssmon33=10.135.38.143:6789/0,dssmon34=10.135.38.144:6789/0,dssmon35=10.135.38.145:6789/0,dssmon4=10.140.208.226:6789/0,dssmon5=10.140.208.227:6789/0,dssmon6=10.140.208.228:6789/0,dssmonleader1=10.140.208.223:6789/0}
>             election epoch 792, quorum 0,1,2,3,4,5,6,7,8,9,10
> dssmon31,dssmon32,dssmon33,dssmon34,dssmon35,dssmonleader1,dssmon2,dssmon3,dssmon4,dssmon5,dssmon6
>      osdmap e8778: 2774 osds: 2746 up, 2746 in; 30 remapped pgs

doesn't match this ^^

which makes it look like a problem with OSDs reporting PG state to the
mon.  The fact that an OSD restarts supports that theory.

What version is this?  A bunch of the osd -> mon pg reporting code was
recently rewritten (between infernalis and jewel), so the new code is
hopefully more robust.  (OTOH, it is also new, so we may have missed
something.)

Nice big cluster!

sage
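
A rough sketch of how one might check both points from the CLI (the
running versions, and whether the mon's view disagrees with what the
OSDs themselves report).  The OSD id and pg id below are placeholders,
not values from this cluster; only the mon name is taken from the
monmap above:

    # confirm the versions actually running on the mons and OSDs
    ceph --version
    ceph tell mon.dssmonleader1 version
    ceph tell osd.123 version          # substitute a real OSD id

    # list the PGs the mon considers stuck, then ask the acting OSDs directly
    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg 13.7a query                # substitute a pg id from dump_stuck

If the acting OSDs' own view of a PG (from pg query) looks clean while
the mon still reports it degraded/undersized, that points at the
osd -> mon reporting path rather than at actually missing replicas.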