Hi,

I have a problem with a cluster that is stuck in recovery after an OSD failure. At first recovery was progressing quite well, but now it just sits there without making any progress. It currently looks like this:

     health HEALTH_ERR
            36 pgs are stuck inactive for more than 300 seconds
            50 pgs backfill_wait
            52 pgs degraded
            36 pgs down
            36 pgs peering
            1 pgs recovering
            1 pgs recovery_wait
            36 pgs stuck inactive
            52 pgs stuck unclean
            52 pgs undersized
            recovery 261632/2235446 objects degraded (11.704%)
            recovery 259813/2235446 objects misplaced (11.622%)
            recovery 2/1117723 unfound (0.000%)
     monmap e3: 3 mons at {0=192.168.19.13:6789/0,1=192.168.19.17:6789/0,2=192.168.19.23:6789/0}
            election epoch 78, quorum 0,1,2 0,1,2
     osdmap e7430: 6 osds: 4 up, 4 in; 88 remapped pgs
            flags sortbitwise
      pgmap v20023893: 256 pgs, 1 pools, 4366 GB data, 1091 kobjects
            8421 GB used, 10183 GB / 18629 GB avail
            261632/2235446 objects degraded (11.704%)
            259813/2235446 objects misplaced (11.622%)
            2/1117723 unfound (0.000%)
                 168 active+clean
                  50 active+undersized+degraded+remapped+wait_backfill
                  36 down+remapped+peering
                   1 active+recovering+undersized+degraded+remapped
                   1 active+recovery_wait+undersized+degraded+remapped

Is there any way to get it to resume recovery?

Thanks,
Philipp
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com