Just to be clear, this is from a cluster that was healthy, had a disk replaced, and hasn't returned to healthy? It's not a new cluster that has never been healthy, right?
Assuming it's an existing cluster, how many OSDs did you replace? It almost looks like you replaced multiple OSDs at the same time, and lost data because of it.
Can you give us the output of `ceph osd tree` and `ceph pg 2.33 query`?
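For reference, here's roughly what I'd run to capture enough state to diagnose this (assuming you run it from a monitor or admin node with the client.admin keyring):

    ceph -s                  # overall cluster, MON, and MDS state
    ceph health detail       # the full per-PG detail you pasted below
    ceph osd tree            # which OSDs are up/down and in/out, per host
    ceph pg 2.33 query       # peering and recovery detail for one stuck PG

That should show whether the rebuilt OSDs ever came back up and in, and which OSDs pg 2.33 is still waiting on.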
On Wed, Nov 19, 2014 at 2:14 PM, JIten Shah <jshah2005@xxxxxx> wrote:
After rebuilding a few OSDs, I see that the PGs are stuck in degraded mode. Some are in the unclean state and others are stale. Somehow the MDS is also degraded. How do I recover the OSDs and the MDS back to healthy? I've read through the documentation and searched the web, but no luck so far.

pg 2.33 is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
pg 0.30 is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
pg 1.31 is stuck unclean since forever, current state stale+active+degraded, last acting [2]
pg 2.32 is stuck unclean for 597129.903922, current state stale+active+degraded, last acting [2]
pg 0.2f is stuck unclean for 597129.903951, current state stale+active+degraded, last acting [2]
pg 1.2e is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
pg 2.2d is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [2]
pg 0.2e is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
pg 1.2f is stuck unclean for 597129.904015, current state stale+active+degraded, last acting [2]
pg 2.2c is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
pg 0.2d is stuck stale for 422844.566858, current state stale+active+degraded, last acting [2]
pg 1.2c is stuck stale for 422598.539483, current state stale+active+degraded+remapped, last acting [3]
pg 2.2f is stuck stale for 422598.539488, current state stale+active+degraded+remapped, last acting [3]
pg 0.2c is stuck stale for 422598.539487, current state stale+active+degraded+remapped, last acting [3]
pg 1.2d is stuck stale for 422598.539492, current state stale+active+degraded+remapped, last acting [3]
pg 2.2e is stuck stale for 422598.539496, current state stale+active+degraded+remapped, last acting [3]
pg 0.2b is stuck stale for 422598.539491, current state stale+active+degraded+remapped, last acting [3]
pg 1.2a is stuck stale for 422598.539496, current state stale+active+degraded+remapped, last acting [3]
pg 2.29 is stuck stale for 422598.539504, current state stale+active+degraded+remapped, last acting [3]
...
6 ops are blocked > 2097.15 sec
3 ops are blocked > 2097.15 sec on osd.0
2 ops are blocked > 2097.15 sec on osd.2
1 ops are blocked > 2097.15 sec on osd.4
3 osds have slow requests
recovery 40/60 objects degraded (66.667%)
mds cluster is degraded
mds.Lab-cephmon001 at X.X.16.111:6800/3424727 rank 0 is replaying journal

—Jiten
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com