Hi,
I have a production cluster on which one OSD on a failing disk was slowing the whole cluster down. I removed the OSD (osd.87) as I usually do in such cases, but this time it resulted in 17 unfound objects. I no longer have the files from osd.87. I was able to run "ceph pg PGID mark_unfound_lost delete" for 10 of those objects. For the remaining 7 objects the command blocks. When I try to run "ceph pg PGID query" on this PG, it also blocks. I suspect this is the same reason why mark_unfound_lost blocks.
Client I/O to PGs that have unfound objects is also blocked. When I try to query the OSDs that hold the PG with unfound objects, "ceph tell" blocks as well.
I tried to mark the PG as complete using ceph-objectstore-tool, but it did not help: the PG is in fact complete, yet for some reason it still blocks.
I also tried recreating an empty osd.87 and importing the PG exported from another replica, but that did not help either.
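For reference, the steps above correspond roughly to the commands below. This is only a sketch that prints the commands rather than running them. PGID, the OSD data paths, and the export file name are placeholders, not the real values from the cluster, and the data paths assume the default /var/lib/ceph/osd layout.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of executing them.
# All values below are placeholders (assumptions), not the real cluster values.
PGID="PGID"
SRC_OSD_PATH="/var/lib/ceph/osd/ceph-SRC"   # surviving replica that holds the PG
DST_OSD_PATH="/var/lib/ceph/osd/ceph-87"    # recreated, empty osd.87
EXPORT_FILE="/tmp/${PGID}.export"

# Print one command per line.
show() { printf '%s\n' "$*"; }

recovery_steps() {
  # Commands sent to the running cluster (these are the ones that block).
  show ceph pg "$PGID" mark_unfound_lost delete
  show ceph pg "$PGID" query
  # ceph-objectstore-tool works on an OSD's local store, so the OSD must be stopped first.
  show ceph-objectstore-tool --data-path "$SRC_OSD_PATH" --pgid "$PGID" --op mark-complete
  show ceph-objectstore-tool --data-path "$SRC_OSD_PATH" --pgid "$PGID" --op export --file "$EXPORT_FILE"
  show ceph-objectstore-tool --data-path "$DST_OSD_PATH" --op import --file "$EXPORT_FILE"
}

recovery_steps
```

On a FileStore OSD, ceph-objectstore-tool may additionally need --journal-path; that is left out here.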