I have considered how I want to implement the first cut of data
redundancy in tabled, and came up with the following plan:

- The data checksum is to be stored in both the db4 entry and the
  metadata in chunk.

- tabled gets a new "thread" (probably not really a thread) that walks
  the keys in db4 by record number, then asks every listed chunkserver
  to verify that the OID is present (in fact, asks it to fetch the
  metadata). If the OID is not present, the scanner updates the db4
  entry. If the reported checksum mismatches the one in db4, it
  (optionally) tells the chunkserver to drop that object. If in the end
  the object's redundancy is insufficient, a replication is scheduled.

- chunkservers run checksums by themselves, without a command from
  tabled, and verify against the checksums in the metadata that they
  hold. If an object fails, it is removed (or marked dead, I haven't
  decided yet).

- A separate process running at tabled executes the scheduled
  replications. The usual applies: control the bandwidth so we do not
  hog the network and the servers, batch up OIDs that travel between
  the same pair of chunkservers, and so on.

(Rough sketches of the scanner, the scrub, and the batching are
appended below.)

Thanks to the decoupling of checksumming from tabled's affairs, this
all looks nicely parallel, except for the db4 scanner.

I decided not to do anything about the db4 itself at this stage,
because any improvements need a new database. I have a vague plan for
that as well, but it must wait.

The above is what will make tabled usable for the general public.

-- Pete
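A rough sketch of the db4 scanner loop, using the stock Berkeley DB
4.x cursor API. The record layout (struct obj_ent) and the helpers
chunk_fetch_meta(), chunk_drop() and schedule_replication() are
made-up placeholders for the real tabled structures and chunkserver
RPCs, and it iterates with DB_NEXT rather than by explicit record
number, to keep it short:

/*
 * Walk every key in the db4 metadata database and verify each
 * object's replicas against the checksum stored in its entry.
 */
#include <string.h>
#include <db.h>

#define CSUM_LEN 20			/* assuming a SHA-1 sized checksum */

struct obj_ent {			/* assumed db4 record layout */
	unsigned char csum[CSUM_LEN];	/* checksum kept in the db4 entry */
	int nr_nodes;			/* how many replicas we want (<= 3 here) */
	unsigned int nodes[3];		/* chunkserver ids holding the OID */
};

/* Fetch the object's metadata from one chunkserver; returns 0 and
 * fills csum if the OID is present, negative if absent. Hypothetical. */
extern int chunk_fetch_meta(unsigned int node, const void *oid,
			    size_t oid_len, unsigned char *csum);
extern int chunk_drop(unsigned int node, const void *oid, size_t oid_len);
extern void schedule_replication(const void *oid, size_t oid_len);

int scan_db4(DB *db)
{
	DBC *cur;
	DBT key, val;
	int rc, i;

	rc = db->cursor(db, NULL, &cur, 0);
	if (rc)
		return rc;

	memset(&key, 0, sizeof(key));
	memset(&val, 0, sizeof(val));

	while ((rc = cur->c_get(cur, &key, &val, DB_NEXT)) == 0) {
		struct obj_ent *ent = val.data;
		unsigned char csum[CSUM_LEN];
		int live = 0;

		for (i = 0; i < ent->nr_nodes; i++) {
			if (chunk_fetch_meta(ent->nodes[i], key.data,
					     key.size, csum) < 0)
				continue;	/* absent: update the db4 entry */
			if (memcmp(csum, ent->csum, CSUM_LEN) != 0) {
				/* mismatch: optionally drop the object */
				chunk_drop(ent->nodes[i], key.data, key.size);
				continue;
			}
			live++;
		}
		if (live < ent->nr_nodes)
			schedule_replication(key.data, key.size);
	}

	cur->c_close(cur);
	return rc == DB_NOTFOUND ? 0 : rc;
}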
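The chunkserver-side scrub is about this much: re-hash the stored
object and compare with the checksum in the local metadata. SHA-1 here
is an assumption (the actual algorithm is whatever we settle on), and
mark_dead() stands in for whichever of "remove" or "mark dead" we end
up picking:

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

extern void mark_dead(const char *path);	/* hypothetical */

/* Returns 0 if the object verifies, 1 if it failed and was handled,
 * -1 if the file could not be read. */
int scrub_object(const char *path,
		 const unsigned char expect[SHA_DIGEST_LENGTH])
{
	unsigned char buf[65536], md[SHA_DIGEST_LENGTH];
	SHA_CTX ctx;
	size_t n;
	FILE *f;

	f = fopen(path, "rb");
	if (!f)
		return -1;

	/* Re-checksum the object data in chunks. */
	SHA1_Init(&ctx);
	while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
		SHA1_Update(&ctx, buf, n);
	fclose(f);
	SHA1_Final(md, &ctx);

	if (memcmp(md, expect, SHA_DIGEST_LENGTH) != 0) {
		mark_dead(path);	/* or unlink(path), still undecided */
		return 1;
	}
	return 0;
}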
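The batching in the replicator amounts to keeping the pending queue
sorted by the (source, destination) pair, so one connection carries a
run of OIDs. copy_batch(), the fixed-size OID, and the node ids are
assumptions; the bandwidth control would wrap the copy_batch() call:

#include <stdlib.h>

struct repl_req {
	unsigned int src, dst;		/* chunkserver node ids */
	unsigned char oid[64];		/* assumed fixed-size OID */
};

/* Order requests by (src, dst) so equal pairs become adjacent. */
static int repl_cmp(const void *a, const void *b)
{
	const struct repl_req *x = a, *y = b;

	if (x->src != y->src)
		return x->src < y->src ? -1 : 1;
	if (x->dst != y->dst)
		return x->dst < y->dst ? -1 : 1;
	return 0;
}

/* Push n objects from src to dst over one connection. Hypothetical. */
extern int copy_batch(unsigned int src, unsigned int dst,
		      struct repl_req *reqs, int n);

void run_replications(struct repl_req *q, int nq)
{
	int i, j;

	qsort(q, nq, sizeof(*q), repl_cmp);
	for (i = 0; i < nq; i = j) {
		/* Find the run of requests sharing this (src, dst) pair. */
		for (j = i + 1; j < nq; j++)
			if (q[j].src != q[i].src || q[j].dst != q[i].dst)
				break;
		copy_batch(q[i].src, q[i].dst, &q[i], j - i);
	}
}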