Hi,

We had a discussion last week at Cephalocon about the consistency of RGW objects. This is just a thought, not designed in detail yet; it could certainly be designed and implemented better.

Right now we don't have any tool that checks the consistency of the data in a bucket at the RGW level. Ceph OSD scrub doesn't catch any RGW-related errors beyond disk errors, OSD-level issues, etc. RADOS/OSD has no context about the data written from RGW at the logical level; it treats the data as a binary stream of rados objects and does all of its validation at that level. Silent corruption at the rados level, caused for example by incomplete PGs or disk corruption (if it happens on the primary, scrub overwrites all replicas with the wrong data), can make reads fail or serve wrong data. In our case, an incomplete PG, caused by a cache-tier OSD node coming back after a long time, left us with objects missing all of their chunks except the head chunk. We had to use offline tools to find the damaged rados objects on the cluster and identify the corresponding RGW objects. This led us to the need for a scrub-like tool at the RGW level.

One proposal is to implement an on-demand scrub mechanism per bucket or per user. If a user is specified, the scrub scans all of that user's buckets and reports any problems encountered. This could be part of the radosgw-admin command, and there is room for optimizations here.

A second option is to add scrub as a periodic activity in all the RGWs. We could attach policies to a bucket or a user, e.g. scrub all buckets once every 3-6 months, and scan only some percentage of buckets/users on each invocation, given the resources scrub needs. This scheduling engine could be designed separately to handle user policies on when to schedule and which type of scrub to run.

Like OSD scrub, scrub could come in two flavors: shallow and deep. A shallow scrub does not read the data; it does a rados stat on each object to check its existence and compares a few parameters for consistency. It stats all the chunks/multipart parts by walking the manifest and reports any object whose size differs from what the manifest records. PG deep scrub already guarantees consistency among the replicas, so checking every replica is not a requirement here. A deep scrub reads the data, compares the md5sum with the ETag, and makes sure the contents are intact. The data is not read from all replicas; that part is delegated to PG deep scrub. We could run these tasks under different policies, like OSD scrub: say, shallow scrubs on all of a user's buckets once a month, and deep scrubs once every 3 months, or over some percentage of buckets every 3 months.

Please let us know your thoughts. As for the current state, we have a few offline tools written in Python that scrub all the buckets and check consistency; we will be sending them soon. A rough sketch of the checks follows below.

Thanks,
Varada
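To give a flavor of the two modes, here is a minimal sketch in the spirit of those offline tools. It goes through the S3 API with boto3 rather than through the RGW manifest in rados (which an in-RGW scrub would use); the endpoint, credentials, and bucket name are placeholders. Shallow maps to a HEAD per object (existence plus size against the listing); deep additionally GETs the body and compares its MD5 with the ETag.

#!/usr/bin/env python3
# Hypothetical sketch of a per-bucket scrub over the S3 API; the
# endpoint, credentials, and bucket name below are placeholders.
import hashlib

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",  # placeholder RGW endpoint
    aws_access_key_id="ACCESS_KEY",              # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

def scrub_bucket(bucket, deep=False):
    """Shallow: HEAD every object, checking existence and that the size
    matches the bucket listing. Deep: additionally GET the body and
    compare its MD5 with the ETag (only valid for non-multipart uploads,
    whose ETag is the plain MD5 of the contents)."""
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        for entry in page.get("Contents", []):
            key, listed_size = entry["Key"], entry["Size"]
            try:
                head = s3.head_object(Bucket=bucket, Key=key)
            except ClientError as e:
                print("DAMAGED %s: HEAD failed (%s)" % (key, e))
                continue
            if head["ContentLength"] != listed_size:
                print("DAMAGED %s: size %d != listed %d"
                      % (key, head["ContentLength"], listed_size))
            if not deep:
                continue
            etag = head["ETag"].strip('"')
            if "-" in etag:  # multipart ETag is not a plain MD5; skip here
                continue
            md5 = hashlib.md5()
            body = s3.get_object(Bucket=bucket, Key=key)["Body"]
            for chunk in iter(lambda: body.read(4 * 1024 * 1024), b""):
                md5.update(chunk)
            if md5.hexdigest() != etag:
                print("DAMAGED %s: md5 %s != etag %s"
                      % (key, md5.hexdigest(), etag))

scrub_bucket("mybucket", deep=True)  # placeholder bucket name

Note that the MD5-equals-ETag check only holds for non-multipart uploads; an RGW-internal deep scrub could verify multipart objects part by part through the manifest, which this S3-level sketch has to skip.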