Historically I have often, but not always, found that removing / destroying the affected OSD would clear the inconsistent PG. At one point the logged message was clear about who reported the error and which OSD was the perp, but a later release broke that. Not sure what recent releases say, since with Luminous I rarely saw them. Perhaps HDD behavior is more conducive to them.

Depending on your device, unrecovered read errors may not warrant replacement; they often represent routine slipped / reallocated blocks. In such cases rewriting the data is sufficient. With older releases, redeploying the OSD (or surgically excising the affected data) would suffice. With Nautilus I'm told that ceph-osd (or BlueStore?) will rewrite the data automagically and the OSD will not need to be reprovisioned.

It would still be a good idea to keep an eye on escalating rates of reallocation and on dwindling spares / percentage of spares remaining. One SSD manufacturer told me that when remaining spares get down to 13%, performance is impacted by roughly 10% and the drive should be considered about to fail. I've seen both an HDD model and an SSD model with design / firmware flaws that were tickled by specific Ceph access patterns, so if you experience a pandemic of these errors across many drives there may be more to it than individual failing disks. A rough sketch of the commands I'd use to check is at the bottom of this message, below the quoted thread.

> On May 23, 2020, at 3:18 AM, Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx> wrote:
>
> When I see this problem usually:
>
> - I run pg repair
> - I remove the OSD from the cluster
> - I replace the disk
> - I recreate the OSD on the new disk
>
> Cheers, Massimo
>
>> On Wed, May 20, 2020 at 9:41 PM Peter Lewis <plewis@xxxxxxxxxxxxxx> wrote:
>>
>> Hello,
>>
>> I came across a section of the documentation that I don't quite
>> understand. In the section about inconsistent PGs it says if one of the
>> shards listed in `rados list-inconsistent-obj` has a read_error the disk is
>> probably bad.
>>
>> Quote from documentation:
>>
>> https://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
>> `If read_error is listed in the errors attribute of a shard, the
>> inconsistency is likely due to disk errors. You might want to check your
>> disk used by that OSD.`
>>
>> I determined that the disk is bad by looking at the output of smartctl. I
>> would think that replacing the disk by removing the OSD from the cluster
>> and allowing the cluster to recover would fix this inconsistency error
>> without having to run `ceph pg repair`.
>>
>> Can I just replace the OSD and the inconsistency will be resolved by the
>> recovery? Or would it be better to run `ceph pg repair` and then replace
>> the OSD associated with that bad disk?
>>
>> Thanks!
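
To expand on the above, here is a rough sketch of the commands involved; <pgid> and /dev/sdX are placeholders for your own PG id and device, and SMART attribute names vary by drive model and vendor, so adjust the grep pattern to taste:

    # identify the inconsistent PG(s)
    ceph health detail | grep inconsistent

    # see which object / shard carries the read_error and which OSD holds it
    rados list-inconsistent-obj <pgid> --format=json-pretty

    # have Ceph repair the PG (rewriting the bad copy from an authoritative one)
    ceph pg repair <pgid>

    # then inspect the drive behind the implicated OSD
    smartctl -a /dev/sdX | grep -Ei 'realloc|pending|uncorrect|spare|wear'

If the reallocation / pending-sector counters keep climbing between checks, treat the drive as failing rather than just repairing and moving on.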