Hi all,

we have a Ceph cluster with currently 360 OSDs in 11 systems. Last week we replaced one OSD system with a new one. During that, we had a lot of problems with OSDs crashing on all of our systems, but that is not our current problem.

After we got everything up and running again, we still have 3 PGs in the state incomplete. I checked one of them directly on the systems (replication factor is 3). On two machines the directory was there but empty; on the third one I found some content. Using ceph_objectstore_tool I exported this PG and imported it on the other nodes. Nothing changed.

We only use Ceph for providing rbd images. Right now, two of them are unusable, because Ceph hangs when someone tries to access content in these PGs. As if that were not bad enough, if I create a new rbd image, Ceph still uses the incomplete PGs, so it is pure gambling whether a new volume will be usable or not. That, for now, makes our 900TB Ceph cluster unusable because of 3 bad PGs.

And right here it seems like I can't do anything. Instructing the Ceph cluster to scrub, deep-scrub or repair the PG does nothing, even after several days. Checking which rbd images are affected is also not possible, because rados -p poolname ls hangs forever when it comes to one of the incomplete PGs. ceph osd lost also does nothing.

So right now, I am OK with losing the content of these three PGs. How can I get the cluster back to life without deleting the whole pool, which is not up for discussion?

Regards,
Christian

P.S. We are using Giant.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com