Sending the mail again without pesky html tags.

On Fri, Jun 5, 2015 at 2:40 PM, Loic Dachary <loic@dachary.org> wrote:
> (...)
> Why do you think the current interface is insufficient ? What would you
> need in addition ?

I am not sure whether or not the interface is sufficient. Let me try to
explain my assumptions so that we can clarify.

Let's say, for the sake of simplicity, that I have a system of 14 HDDs
with an allocation unit size of 4K, and that I am using K = 10
systematic drives and M = 4 redundancy drives (and no spares). Let's say
that I am using an encoding scheme with 8 substripes (for each of the 14
stripes) and only one object size, which perfectly matches the scheme:
4K * 10 * 8. Each raw object then consists of 80 chunks, and I am adding
32 chunks of redundancy data, making each encoded object take up
4K * 8 * 14 = 448K.

The chunks are to be physically stored at these offsets:

HDD0: chunks 0-7
HDD1: chunks 8-15
(...)
HDD13 (redundancy 3): chunks 104-111

Assuming that the coding scheme is MDS, it would guarantee recovery from
up to 4 lost hard drives. It would not guarantee recovery of 32
arbitrary chunks (the same amount of data when considering a single
object), as the lost chunks would have to be organized in adjacent
groups of 8.

Assuming the CRUSH map can be used to configure this sort of chunk
placement, perhaps the interface is indeed sufficient?

And, not specific to the interface definition, but about how Ceph uses
the interface during operation and during tests:
Would the interface receive decode requests for sets of chunks that are
not organized in groups of 8?
Would the subpacketization (i.e. grouping of chunks) create problems for
the unit tests?
Do you experts see any other implications or side effects?

Motivation: recovering one drive in a (14 total disks, 10 systematic
data disks) setup using Reed-Solomon requires reading from 10 drives.
This can theoretically be reduced by ~40% by introducing substripes
(splitting each of the 14 parts into many smaller parts, while
fundamentally storing the first 10 major parts in exactly the same way
on the HDD, so that the I/O of normal reads is not impacted at all).
There are many trade-offs to consider, and so we wish to test the
performance differences.

Sincerely,
Sindre B. Stene
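
P.S. In case it helps the discussion, here is a minimal sketch (plain
Python, with my own names, not the Ceph erasure code plugin API) of the
chunk placement described above, plus a check of which chunk-loss
patterns the drive-level MDS property actually guarantees:

K = 10            # systematic drives
M = 4             # redundancy drives
SUBSTRIPES = 8    # substripes, stored adjacently per drive
CHUNK = 4 * 1024  # allocation unit, 4K

def drive_of_chunk(chunk_index):
    """HDD0 holds chunks 0-7, HDD1 holds 8-15, ..., HDD13 holds 104-111."""
    return chunk_index // SUBSTRIPES

def recovery_guaranteed(lost_chunks):
    """A scheme that is MDS over drives only guarantees recovery when the
    lost chunks touch at most M distinct drives; 32 arbitrary chunks are
    not covered unless they form whole-drive groups of SUBSTRIPES."""
    return len({drive_of_chunk(c) for c in lost_chunks}) <= M

# Sanity checks against the numbers in this mail:
assert K * SUBSTRIPES * CHUNK == 320 * 1024          # raw object: 4K * 10 * 8
assert (K + M) * SUBSTRIPES * CHUNK == 448 * 1024    # encoded object: 4K * 8 * 14
assert recovery_guaranteed(range(4 * SUBSTRIPES))    # 4 whole drives lost: covered
assert not recovery_guaranteed([0, 8, 16, 24, 32])   # 5 chunks on 5 drives: not covered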
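
And a back-of-the-envelope version of the repair-read argument from the
motivation, under the same assumed parameters (the ~40% figure is the
theoretical reduction mentioned above, not a measurement):

# Same assumed parameters as above.
K, SUBSTRIPES, CHUNK = 10, 8, 4 * 1024
per_drive = SUBSTRIPES * CHUNK            # 32K of each encoded object per drive
rs_repair = K * per_drive                 # plain Reed-Solomon repair: 10 * 32K = 320K
substripe_repair = 0.6 * rs_repair        # ~40% reduction: ~192K read per object
print(f"repair reads per object: {rs_repair // 1024}K -> ~{int(substripe_repair) // 1024}K")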