Hi everyone,

It seems I hit Bug #44286 "Cache tiering shows unfound objects after OSD reboots" <https://tracker.ceph.com/issues/44286>.

I stopped some OSDs to compact their RocksDB; noout was set during this time. Soon after that I got:

    [ERR] PG_DAMAGED: Possible data damage: 4 pgs recovery_unfound
        pg 8.8 is active+recovery_unfound+degraded, acting [42,43,39], 1 unfound
        pg 8.14 is active+recovery_unfound+degraded, acting [43,40,42], 1 unfound
        pg 8.3b is active+recovery_unfound+degraded, acting [36,40,43], 1 unfound
        pg 8.50 is active+recovery_unfound+degraded, acting [39,38,36], 1 unfound

    ceph pg 8.8 list_unfound
    {
        "num_missing": 1,
        "num_unfound": 1,
        "objects": [
            {
                "oid": {
                    "oid": "hit_set_8.8_archive_2022-08-12 12:12:06.515941Z_2022-08-12 12:18:16.186156Z",
                    "key": "",
                    "snapid": -2,
                    "hash": 8,
                    "max": 0,
                    "pool": 8,
                    "namespace": ".ceph-internal"
                },
                "need": "118438'7610615",
                "have": "0'0",
                "flags": "none",
                "clean_regions": "clean_offsets: [], clean_omap: 0, new_object: 1",
                "locations": []
            }
        ],
        "state": "NotRecovering",
        "available_might_have_unfound": true,
        "might_have_unfound": [],
        "more": false
    }

The other missing objects look the same. Since every oid is a hit_set_* object, I assume no actual data is affected. The question is how to get rid of the error.

This is a cache pool with replica x3 in front of a CephFS data pool with EC 6+2. Only the hit set objects from the cache pool are affected, and everything seems to work so far. The cluster is stuck in "HEALTH_ERR: Possible data damage: 4 pgs recovery_unfound", though.

I could not get the PGs to deep scrub to find the missing objects, not even after disabling scrubbing on all OSDs except the affected ones. Repairing the PGs does not start either, since repair is a scrub operation as well. They are just queued for deep scrub, but nothing happens. I tried:

    ceph pg deep-scrub 8.8
    ceph pg repair 8.8

I also tried marking one of the primary OSDs out, but the affected PG stayed on that OSD.

What's the best course of action to get the cluster back to a healthy state? Should I run

    ceph pg 8.8 mark_unfound_lost revert

or

    ceph pg 8.8 mark_unfound_lost delete

or is there another way? Would the cache pool still work after that?

Thanks,
Eric
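
P.S. In case it helps: a quick way to double-check that only hit_set objects are unfound across all four PGs (just a sketch, assuming jq is installed; it only uses "ceph health detail" and the "ceph pg <pgid> list_unfound" output shown above):

    #!/bin/sh
    # List the unfound object names for every PG reported as recovery_unfound.
    # The per-PG lines in "ceph health detail" start with "pg <pgid> is ...".
    for pg in $(ceph health detail | awk '$1 == "pg" && /recovery_unfound/ {print $2}'); do
        echo "== $pg =="
        ceph pg "$pg" list_unfound | jq -r '.objects[].oid.oid'
    done

In my case all four PGs only show hit_set_* objects in the ".ceph-internal" namespace.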
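And to see why the PGs sit in "NotRecovering", one could also inspect the recovery state machine of an affected PG (again a sketch; recovery_state is a standard field in the "ceph pg query" JSON output):

    # Show the recovery state history for PG 8.8
    ceph pg 8.8 query | jq '.recovery_state'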