We had a great session with Loic, Christopher, Sam, and Greg discussing how to move forward with erasure coding support.

The high-level consensus on approach:

- it is possible to do erasure coding above rados across distinct pools, but it is harder and less useful.
- we should have an ErasureCodedPG that takes advantage of CRUSH's placement to place shards
- we can support a limited subset of rados operations for such pools and still be useful (write_full, or write/append on block boundaries; see the sketch at the end of this mail)
- this will be used in conjunction with a replicated pool as a second tier of storage, or by applications that are happy with a limited subset of commands.

That said, the implementation will be non-trivial. But we identified several areas where code cleanup will move us down the right path. By factoring out useful components of PG and ReplicatedPG into separate classes, we clean up the current interfaces and can build unit tests for them as we go, for immediate benefit.

Initial focus areas:

- clean up the OSD -> PG interface (PG -> OSD is already reasonably well captured by the OSDService class)
- ObjectContext tracking
- PG log handling
- PG missing
- RepOp state
- Peering state machine

The last one will be the trickiest and is saved for last. In each case, we'll have to think carefully about how well things generalize from replication to erasure coding.

Loic has volunteered to own this work, and Sam and I will be supporting him. He'll also be joining our daily core standup. Yay!

sage
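
To illustrate the block-boundary restriction mentioned above, here is a rough sketch of the kind of alignment check an erasure-coded pool might apply. This is illustrative only, not actual Ceph code; the struct and member names are invented, and the real interface will fall out of the refactoring work above.

  #include <cstdint>

  // Hypothetical policy for which writes an erasure-coded pool would accept.
  struct ECWritePolicy {
    uint64_t stripe_width;  // k * chunk_size in bytes; writes must align to this

    // write_full is always acceptable: the whole object is re-encoded
    // and every shard is rewritten.
    bool allow_write_full() const { return true; }

    // A partial write or append is only acceptable when both its offset
    // and its length fall on stripe boundaries, so each shard receives
    // whole chunks and no read-modify-write of existing chunks is needed.
    bool allow_write(uint64_t offset, uint64_t length) const {
      return stripe_width > 0 &&
             offset % stripe_width == 0 &&
             length % stripe_width == 0;
    }
  };

An aligned append is then just allow_write(current_object_size, length), with the object size kept at a multiple of the stripe width.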