Planning Ceph from an admin's perspective. Ceph calculator?

Hi,

It seems more and more that Ceph is out in the wild: people are using it in production, development is speeding up, etc.

Still, how does one pick a configuration that suits one's needs?

For example, we wish to replace older IBM/HP SAN storage with Ceph. We know the IOPS and bandwidth capabilities of those arrays, but there is no "Ceph calculator" to estimate how many OSDs/hosts we would need to match the existing storage's performance.

We have done some tests internally with up to 7 OSDs (2-3 hosts), but increasing the OSD count in such small steps does not influence Ceph performance enough, or linearly enough, to extrapolate to the performance we need. I have been following the performance figures on the mailing lists and Mark's performance tests, and I have read Inktank's reference architecture at http://www.inktank.com/resource/ceph-reference-architecture/ (by the way, that document mentions another "Multi-Rack Object Storage Reference Architecture" which I cannot find - has anyone found it?). In the end I have come to the wild guess that starting with ~24 spinning OSDs on 2-3 hosts should match the performance we need to begin with. At this point it would be helpful to have some estimation tool or reliable reference to confirm that the estimate is realistic :)
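For what it's worth, here is the kind of back-of-envelope arithmetic behind that guess. The per-disk IOPS, replica count and journal penalty below are my assumptions, not measured values:

```python
# Rough IOPS estimate for a spinning-disk cluster (assumed numbers only).
osds = 24                # planned spinning OSDs
iops_per_disk = 75       # assumed random IOPS of a 7.2k SATA drive
replicas = 2             # assumed replication count
journal_penalty = 2      # journal on the same spindle doubles each write

write_iops = osds * iops_per_disk / (replicas * journal_penalty)  # ~450
read_iops = osds * iops_per_disk                                   # ~1800
print(write_iops, read_iops)
```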

Sure, Ceph is a dynamic creature and many parameters influence the resulting performance (SSD vs. spinning HDD, network, filesystems, journals, replication count, etc.), but still, when people face the question "can we do it with Ceph?", some methodology or a tool like a "Ceph calculator" would help to estimate the needed hardware and, consequently, the expected investment. Admins need to convince management of a certain solution at times, right? :)

I have been thinking about this information gap and I see two complementary solutions:
1) Create a publicly available reference list of real-life Ceph configurations and their performance, where people can add their clusters and compare. A standardized approach to conducting the tests would have to be in place, so we compare apples to apples. For example, people could specify their OSD host count, OSD count, OSD filesystem, OSD server model, CPU, RAM, network interfaces (type, speed), Ceph version, replica count and the like, plus provide standardized performance test results for their cluster, such as rados bench and fio tests with 4K/4M block sizes, random/sequential, read/write (a rough sketch of such a record follows after this list).
Others could then look for a matching working configuration and compare it to their own clusters. This should encourage newcomers with real examples and give existing Ceph users pointers for possible tuning.

2) Develop a theoretical Ceph calculator or formula where one specifies the needed performance characteristics (IOPS, bandwidth, size) and the planned hardware parameters (if available), and gets an estimated Ceph configuration (needed hosts, CPUs, RAM, OSDs, network). This should take into consideration HDD count and size, the smallest IOPS per HDD, network latency, RAM, replica count, the connection type to Ceph (directly via the kernel client, userland, via an FC/iSCSI proxy, etc.) and other influencing parameters. There will always be a place for advanced know-how tuning; this would just be for rough estimates to get started (see the calculator sketch below).
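For 1), a submission record could look something like the following. The field names and example values are only a suggestion, not an existing schema:

```python
# Hypothetical standardized record for a public configuration/performance list.
cluster_report = {
    "ceph_version": "0.67 (dumpling)",          # illustrative value
    "hosts": 3,
    "osds": 24,
    "osd_filesystem": "xfs",
    "osd_server_model": "example 2U, 12x 3.5-inch bays",
    "cpu": "2x 6-core Xeon",
    "ram_gb": 64,
    "network": "2x 10GbE (public + cluster)",
    "replica_count": 2,
    "results": {
        "rados_bench_4m_seq_write_mb_s": None,  # fill in measured values
        "fio_4k_randwrite_iops": None,
        "fio_4k_randread_iops": None,
        "fio_4m_seq_read_mb_s": None,
    },
}
```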
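And for 2), a minimal sketch of what the calculator's core might do, assuming per-disk IOPS, disk size, replica count and journal penalty as inputs (all the default constants are placeholders, not recommendations):

```python
import math

def estimate_osd_count(target_write_iops, target_read_iops, target_capacity_tb,
                       disk_iops=75, disk_tb=3.0, replicas=2, journal_penalty=2):
    """Return the spinning-OSD count needed to satisfy all three targets."""
    for_writes = target_write_iops * replicas * journal_penalty / disk_iops
    for_reads = target_read_iops / disk_iops
    for_capacity = target_capacity_tb * replicas / disk_tb
    return math.ceil(max(for_writes, for_reads, for_capacity))

# Example: 500 write IOPS, 2000 read IOPS, 20 TB usable -> 27 OSDs
print(estimate_osd_count(500, 2000, 20))
```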

Both things seem to naturally belong on http://wiki.ceph.com/ and could be hosted there, as it is currently the central Ceph knowledge base, right? :)

What do you think about the chances of implementing both of these?

Ugis




