I have a few more questions.

- Can files stored in the OSD heal "incrementally"? Suppose there are 3 replicas of a large file and a small byte range changes while replica 3 is down. Will replica 3 heal efficiently when it returns, i.e. will only the small changed byte range be transferred?

- Also, can reads be spread out over replicas? This might be a nice optimization to reduce seek times under certain conditions: when there are no writers, or when the writer is the only reader (and is thus aware of all the writes even before they complete). Under these conditions it seems like it would be possible to not enforce the "tail reading" order of replicas, and so additionally benefit from "read striping" across the replicas, the way many RAID implementations do with RAID1.

I thought that this might be particularly useful for RBD when it is used exclusively (say, by mounting a local FS), since even with replicas it seems like it could then relax the replica tail-reading constraint.

Any thoughts?

Thanks,

-Martin