Hi,

I have a 3x-replicated pool with Ceph 12.2.7. One HDD broke, its OSD "2" was automatically marked "out", the disk was physically replaced with a new one, and the new OSD was added back in. Now `ceph health detail` permanently shows:

    [ERR] OSD_SCRUB_ERRORS: 1 scrub errors
    [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
        pg 2.87 is active+clean+inconsistent, acting [33,2,20]

What exactly is wrong here, and why can Ceph not fix the issue itself? With BlueStore I have checksums on the two unbroken disks, so what remaining inconsistency can there be?

The command suggested in https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#commands-for-diagnosing-pg-problems does not work:

    # rados list-inconsistent-obj 2.87
    No scrub information available for pg 2.87
    error 2: (2) No such file or directory
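My guess is that the primary simply has no recent scrub results for this PG to report, and that a fresh deep scrub would regenerate them, roughly like this (I have not tried it yet, so please correct me if this is the wrong first step):

    # ceph pg deep-scrub 2.87
    # ... wait for the deep scrub of 2.87 to finish, then:
    # rados list-inconsistent-obj 2.87 --format=json-pretty

Is that the intended way to get the inconsistency details back?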
Further, I find the documentation at https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#more-information-on-pg-repair extremely unclear. It says

    In the case of replicated pools, recovery is beyond the scope of pg repair.

while many people on the Internet suggest that `ceph pg repair` might fix the issue, and still others claim that Ceph will eventually fix it by itself. I am hesitant to run `ceph pg repair` without understanding what the problem actually is and what exactly that command will do.
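If repair does turn out to be the right answer, I assume the invocation would simply be

    # ceph pg repair 2.87

but before running it I would like to know how it decides which copy is authoritative in a replicated pool, and whether it can make things worse if it picks the wrong one.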
I have already reported the "error 2" failure and the unclear documentation in issue https://tracker.ceph.com/issues/61739, but have not received a reply yet, and my cluster remains "inconsistent".

How can this be fixed? I would appreciate any help!

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx