On Wed, Nov 11, 2015 at 8:44 PM, kefu chai <tchaikov@xxxxxxxxx> wrote:
> currently, scrub and repair are pretty primitive. there are several
> improvements which need to be made:
> [snip]
> - repair will create a new version so that possibly corrupted copies
>   on down OSDs will get fixed naturally.

If end users will run this new feature manually, it may be better to
implement a dry-run mechanism, so that the above process can be skipped
and end users can initiate the scrub process with more information, and
perhaps more safely.

Does that make sense?

Cheers,
Shinobu

>
> so librados will offer enough information and facilities, with which a
> smart librados client/script will be able to fix the inconsistencies
> found in the scrub.
>
> as an example, if we run into a data inconsistency where the 3
> replicas failed to agree with each other after performing a deep
> scrub. probably we'd like to have an election to get the auth copy.
> following pseudo code explains how we will implement this using the
> new rados APIs for scrub and repair.
>
>     # something is not necessarily better than nothing
>     rados.aio_scrub(pg, completion)
>     completion.wait_for_complete()
>     for pool in rados.get_inconsistent_pools():
>         for pg in rados.get_inconsistent_pgs(pool):
>             # rados.get_inconsistent_pgs() throws if "epoch" expires
>
>             for oid, inconsistent in rados.get_inconsistent_pgs(pg, epoch).items():
>                 if inconsistent.is_data_digest_mismatch():
>                     # count how many shards report each data digest
>                     votes = defaultdict(int)
>                     for osd, shard_info in inconsistent.shards.items():
>                         votes[shard_info.object_info.data_digest] += 1
>                     # the digest with the most votes wins the election
>                     digest, _ = max(votes.items(), key=operator.itemgetter(1))
>                     auth_copy = None
>                     for osd, shard_info in inconsistent.shards.items():
>                         if shard_info.object_info.data_digest == digest:
>                             auth_copy = osd
>                             break
>                     repair_op = librados.ObjectWriteOperation()
>                     repair_op.repair_pick(auth_copy, inconsistent.ver, epoch)
>                     rados.aio_operate_scrub(oid, repair_op)
>
> this plan was also discussed in the infernalis CDS.
> see http://tracker.ceph.com/projects/ceph/wiki/Osd_-_Scrub_and_Repair.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Email:
shinobu@xxxxxxxxx
shinobu@xxxxxxxxxx
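
To make the dry-run suggestion a bit more concrete, here is a minimal
sketch in plain Python. It does not call librados at all: pick_auth_copy,
plan_repair, and the dry_run flag are hypothetical names used only for
illustration, and the input is assumed to be a map from OSD id to the data
digest reported by that OSD's shard. The election is the same
"most votes wins" idea as in the quoted pseudo code, except that a tied
vote is treated as "no authoritative copy" rather than picking one
arbitrarily.

```python
from collections import Counter


def pick_auth_copy(shard_digests):
    """Return (osd, digest) for the most common digest, or None.

    shard_digests is assumed to map an OSD id to the data digest
    reported by its shard (hypothetical input shape).
    """
    if not shard_digests:
        return None
    ranked = Counter(shard_digests.values()).most_common(2)
    # If the top two digests have the same vote count, there is no winner.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    digest = ranked[0][0]
    # Choose the lowest OSD id holding the winning digest (deterministic).
    osd = min(o for o, d in shard_digests.items() if d == digest)
    return osd, digest


def plan_repair(oid, shard_digests, dry_run=True):
    """Describe what would be done for one object; never touches a cluster."""
    choice = pick_auth_copy(shard_digests)
    if choice is None:
        return {"oid": oid, "action": "skip", "reason": "tied vote"}
    osd, digest = choice
    # In dry-run mode only report the choice; a real client would issue
    # the repair operation here instead.
    action = "report" if dry_run else "repair"
    return {"oid": oid, "action": action, "auth_osd": osd, "digest": digest}


if __name__ == "__main__":
    # Two of three replicas agree, so OSD 0 would be picked as the auth copy.
    print(plan_repair("obj1", {0: 0xABC, 1: 0xABC, 2: 0xDEF}))
    # All three replicas disagree, so nothing is picked.
    print(plan_repair("obj2", {0: 1, 1: 2, 2: 3}))
```

With dry_run left at its default, the user gets a per-object report of
which copy would be chosen and can review it before any repair is issued.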