Hi,
can you share the output of `ceph osd tree`? Which CRUSH rules are in use
in your cluster? I assume that the two failed OSDs prevent the remapping
because the rules can no longer be applied.
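Something along these lines should give the full picture (these are the
standard commands; the exact output format varies a bit between releases):

  ceph osd tree
  ceph osd crush rule dump
  ceph osd pool ls detail
  ceph pg dump_stuck inactive

With only 4 of 6 OSDs up, a rule that requires each replica on a separate
host (or a pool size larger than the remaining failure domains) would
explain why those 36 PGs stay down instead of remapping.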
Regards,
Eugen
Quoting Philipp Schwaha <philipp@xxxxxxxxxxx>:
hi,
I have a problem with a cluster being stuck in recovery after an OSD
failure. At first recovery was progressing quite well, but now it just
sits there without any progress. It currently looks like this:
     health HEALTH_ERR
            36 pgs are stuck inactive for more than 300 seconds
            50 pgs backfill_wait
            52 pgs degraded
            36 pgs down
            36 pgs peering
            1 pgs recovering
            1 pgs recovery_wait
            36 pgs stuck inactive
            52 pgs stuck unclean
            52 pgs undersized
            recovery 261632/2235446 objects degraded (11.704%)
            recovery 259813/2235446 objects misplaced (11.622%)
            recovery 2/1117723 unfound (0.000%)
     monmap e3: 3 mons at
            {0=192.168.19.13:6789/0,1=192.168.19.17:6789/0,2=192.168.19.23:6789/0}
            election epoch 78, quorum 0,1,2 0,1,2
     osdmap e7430: 6 osds: 4 up, 4 in; 88 remapped pgs
            flags sortbitwise
      pgmap v20023893: 256 pgs, 1 pools, 4366 GB data, 1091 kobjects
            8421 GB used, 10183 GB / 18629 GB avail
            261632/2235446 objects degraded (11.704%)
            259813/2235446 objects misplaced (11.622%)
            2/1117723 unfound (0.000%)
                 168 active+clean
                  50 active+undersized+degraded+remapped+wait_backfill
                  36 down+remapped+peering
                   1 active+recovering+undersized+degraded+remapped
                   1 active+recovery_wait+undersized+degraded+remapped
Is there any way to motivate it to resume recovery?
Thanks
Philipp