EC cluster design considerations

Hi everyone,

I am currently looking at Ceph to build a cluster for backing up VMs. I am weighing it against other solutions such as traditional SANs, and so far Ceph is economically more attractive and technically more challenging (which does not bother me :) ).

The OSD hosts would be based on Dell R730xd hardware; I plan to put 3 SSDs (for journals) and 9 OSDs (4 TB each) per host.
I need approximately 100 TB and, in order to save some space while still getting the level of resiliency you would expect for backups, I am leaning towards erasure coding (4+2) across 7 hosts.
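
For reference, here is my rough capacity math (assuming the 9 x 4 TB OSDs per host above and the 1.5x overhead of a 4+2 profile):

    7 hosts x 9 OSDs x 4 TB      = 252 TB raw
    252 TB x 4/6 (k / (k+m))     = 168 TB usable, before keeping headroom for full ratios and a lost host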

I would like some input on the questions that remain:

 - I can fit more OSDs directly inside the server (up to 4 additional disks), but replacing one of those "inner" OSDs after a failure would require powering down the host. I was thinking of adding 3 internal disks to reach 12 OSDs per node instead of 9 for higher density, at the cost of more complex maintenance and a higher risk for the cluster, since there would be 4 OSD journals per SSD instead of 3. How manageable is bringing down a complete node to replace a disk? noout will surely come into play (I sketch the procedure I have in mind below, after these questions). How will the cluster behave when the host comes back online and has to resync its data?

 - I also wonder about SSD failure, even though I intend to use Intel S3700s, or at least S3610s, precisely so I am not bothered by such issues :). In case of an SSD failure, the cluster would start backfilling / rebalancing the data of 3 to 4 OSDs. With proper monitoring and spare disks on hand, one could replace the SSD within hours and avoid the impact of backfilling that much data, but this would require fine tuning of how OSDs are marked out (see the sketch on mark-out tuning below). I know this bends the natural features and behaviour of Ceph a bit, but has anyone tested this approach, with custom monitoring scripts or otherwise? Do you think it can be considered, or is the only realistic way to buy SSDs that can sustain the load? And the same question as above: how does Ceph handle down OSDs that are brought back up after a while?

 - My goal is to reach a bandwidth of several hundred MB/s of mostly sequential writes. Do you think a cluster of this type and size will be able to handle that (my rough numbers are below)? The only benchmarks I could find on the mailing list are Loic's on the EC plugins and Roy's on a full-SSD EC backend.
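
On the node-maintenance question, this is roughly the procedure I have in mind; the flags are the standard ones, but the exact sequence is only my assumption of how it should be done, so please correct me if it is naive:

    # keep the mons from marking the host's OSDs out while it is powered down
    ceph osd set noout
    # stop the OSDs, power down the host, swap the internal disk, boot it back up
    # once its OSDs rejoin, recovery should only have to replay the writes that
    # happened while the host was down (as long as the pg logs are long enough)
    ceph osd unset noout
    ceph -s        # watch the degraded PGs drain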
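
On the SSD-failure question, by "fine tuning how OSDs are marked out" I essentially mean raising the down-to-out timer so that a quickly replaced SSD never triggers a full backfill. Something like the following, assuming mon_osd_down_out_interval is the right knob (the default is 600 s if I read the docs correctly):

    # ceph.conf on the monitors: allow ~2 hours to swap a journal SSD
    # before its OSDs are marked out and backfilling starts
    [mon]
    mon osd down out interval = 7200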
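
On the throughput question, my own back-of-the-envelope numbers, assuming ~100 MB/s of sequential throughput per 4 TB SATA drive, journals on the SSDs, and (as an example figure) 500 MB/s of client writes:

    client writes                  500 MB/s
    EC 4+2 amplification (6/4)     500 x 1.5 = 750 MB/s across 63 data disks (~12 MB/s each)
    per host                       750 / 7   = ~107 MB/s, i.e. ~36 MB/s per journal SSD

On paper the disks look comfortable; I am more worried about CPU for the erasure coding and about the network, since 500 MB/s of client traffic is already ~4 Gbit/s before the coded chunks are shipped between hosts.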

Many thanks in advance,

Adrien



