When I see this problem, I usually:

- run `ceph pg repair`
- remove the OSD from the cluster
- replace the disk
- recreate the OSD on the new disk

Cheers,
Massimo

On Wed, May 20, 2020 at 9:41 PM Peter Lewis <plewis@xxxxxxxxxxxxxx> wrote:
> Hello,
>
> I came across a section of the documentation that I don't quite
> understand. In the section about inconsistent PGs, it says that if one of
> the shards listed by `rados list-inconsistent-obj` has a read_error, the
> disk is probably bad.
>
> Quote from the documentation:
> https://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
> `If read_error is listed in the errors attribute of a shard, the
> inconsistency is likely due to disk errors. You might want to check your
> disk used by that OSD.`
>
> I determined that the disk is bad by looking at the output of smartctl.
> I would think that replacing the disk (removing the OSD from the cluster
> and allowing the cluster to recover) would fix this inconsistency
> without my having to run `ceph pg repair`.
>
> Can I just replace the OSD and let recovery resolve the inconsistency?
> Or would it be better to run `ceph pg repair` first and then replace the
> OSD associated with the bad disk?
>
> Thanks!
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
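For reference, the steps above can be sketched as the following commands. This is only an illustrative outline, not a definitive procedure: the PG id `2.5`, OSD id `12`, and device `/dev/sdX` are placeholders you would substitute with your own values, and the exact removal/recreation steps vary by Ceph release and deployment tooling.

```shell
# Placeholder IDs for illustration only: pg 2.5, osd.12, /dev/sdX.

# 1. Inspect the inconsistency and confirm which shard has the read_error.
rados list-inconsistent-obj 2.5 --format=json-pretty

# 2. Repair the PG (rewrites the bad copy from an authoritative replica).
ceph pg repair 2.5

# 3. Mark the OSD out, wait for data to rebalance, then remove it.
ceph osd out 12
ceph osd purge 12 --yes-i-really-mean-it

# 4. After physically swapping the disk, create a new OSD on the device.
ceph-volume lvm create --data /dev/sdX
```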