If I may, here's one more idea for improving the bluestore repairs.

In the CERN and RAL environments, inconsistent objects are most often
the result of "weak writes", which I've heard Lars discussing in the
past. In this case, the read fails during client IO or deep-scrub, and
SMART increments the Pending Sector counter. The only way to know
whether the sector is truly bad is to try writing the same sector again.

So, when (auto) repairing a bluestore object, could we first try to
overwrite the object in-place?

-- Dan

On Wed, Mar 6, 2019 at 11:38 PM David Zafman <dzafman@xxxxxxxxxx> wrote:
>
>
> Improvements to auto repair
> ------------------------
>
> We should allow auto repair for bluestore pools since it has built-in
> checksums. Currently, we are limited to erasure coded pools.
>
> In order to trigger an auto repair when regular scrub detects errors,
> any errors should immediately schedule a deep-scrub.
>
> Add a new pg state flag "failed_repair" when repairs can't fix all
> errors. This may be tricky to implement because pg repair ends as a
> recovery operation.
>
> Set failed_repair if primary repair triggered by a client read fails.
>
> Add a count of number of objects that are repaired to PG stats and OSD
> stats.
>
>
> David
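
To make the overwrite-in-place suggestion at the top of this message a bit more concrete, here is a minimal Python sketch of the idea: rewrite the same sector from a known-good copy, read it back, and only treat the sector as truly bad if it still fails. Everything in it is an assumption for illustration: ToyDisk and repair_in_place are hypothetical names, not BlueStore or Ceph APIs, and the toy disk simply models a weak write as a sector that fails reads until it is rewritten.

```python
import zlib


class ToyDisk:
    """Hypothetical in-memory stand-in for a disk (not a Ceph/BlueStore API)."""

    def __init__(self, nsectors, weak=(), bad=()):
        self.data = [b"\0" * 512 for _ in range(nsectors)]
        self.weak = set(weak)  # sectors that fail reads until rewritten
        self.bad = set(bad)    # sectors that fail reads even after a rewrite

    def read(self, sector):
        if sector in self.weak or sector in self.bad:
            raise OSError(f"read error on sector {sector}")
        return self.data[sector]

    def write(self, sector, payload):
        self.data[sector] = payload
        # In this toy model a fresh write clears a weak write,
        # but has no effect on a truly bad sector.
        self.weak.discard(sector)


def repair_in_place(disk, sector, good_copy, expected_crc):
    """Rewrite the same sector from a good copy, then read back and verify.

    Returns True if the sector holds good data afterwards, or False if it
    still fails and the data would have to be placed somewhere else.
    """
    disk.write(sector, good_copy)
    try:
        readback = disk.read(sector)
    except OSError:
        return False
    return zlib.crc32(readback) == expected_crc


good = b"x" * 512
crc = zlib.crc32(good)
disk = ToyDisk(8, weak={2}, bad={5})

weak_result = repair_in_place(disk, 2, good, crc)  # weak write: rewrite helps
bad_result = repair_in_place(disk, 5, good, crc)   # bad sector: still fails
print(weak_result, bad_result)
```

In this model only the second case would need the usual repair path that writes the object to a new location; the first is fixed by the rewrite alone.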