> I've been running similar calculations recently. I've been using this
> tool from Inktank to calculate RADOS reliabilities with different
> assumptions:
> https://github.com/ceph/ceph-tools/tree/master/models/reliability
>
> But I've also had similar questions about RBD (or any multi-part files
> stored in RADOS) -- naively, a file/device stored in N objects would
> be N times less reliable than a single object. But I hope there's an
> error in that logic.
It's worth pointing out that Ceph's RGW will actually stripe S3 objects across many RADOS objects, even when it's not a multi-part upload; this has been the case since the Bobtail release. There is an in-depth Google paper on availability modeling that might provide some insight into what the math should look like:
http://research.google.com/pubs/archive/36737.pdf
When reading it, you can think of RADOS objects as the paper's chunks and PGs as its stripes (a rough sketch of the striping math follows the quote below). CRUSH should be configured around the failure domains that cause correlated failures, e.g. shared power and networking. You also want to consider the availability of the facility itself:
"Typical availability estimates used in the industry range from 99.7% availability for tier II datacenters to 99.98% and 99.995% for tiers III and IV, respectively."
http://www.morganclaypool.com/doi/pdf/10.2200/s00193ed1v01y200905cac006
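To make the chunk/stripe analogy a bit more concrete, here is a back-of-the-envelope sketch (not the paper's full model): if each RADOS object is independently available with probability p, a file or RBD image striped across N objects is only available when all N are, so its availability is p^N. For small unavailabilities that works out to roughly N times the per-object unavailability, which is the naive intuition from the original question; what changes the picture in practice is replication keeping per-object unavailability tiny, plus the correlated failures the paper and CRUSH failure domains are about. All the numbers below (disk availability, replica count, object count) are made up for illustration.

```python
# Back-of-the-envelope stripe availability, assuming fully independent failures.
# Illustrative numbers only; real clusters see correlated failures, which is
# exactly what the Google paper models and CRUSH failure domains mitigate.

def object_availability(copy_avail: float, replicas: int) -> float:
    """An object is unavailable only if every replica is down (independence assumed)."""
    return 1.0 - (1.0 - copy_avail) ** replicas

def stripe_availability(obj_avail: float, num_objects: int) -> float:
    """A striped file/RBD image needs all of its N objects at once."""
    return obj_avail ** num_objects

copy_avail = 0.99       # hypothetical availability of a single copy
replicas = 3            # typical RADOS replication factor
num_objects = 2560      # e.g. a 10 GiB RBD image in 4 MiB objects

obj = object_availability(copy_avail, replicas)
print(f"per-object availability:            {obj:.8f}")
print(f"striped across {num_objects} objects: "
      f"{stripe_availability(obj, num_objects):.6f}")
```

With these made-up numbers a single object sits around six nines while the striped image drops to roughly 99.75%, i.e. its unavailability is about N times larger, as expected under the independence assumption.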
If you combine the cluster availability with the facility availability, you might be surprised. A cluster that is itself 99.995% available, sitting in a Tier II facility, gets dragged down to roughly 99.7% overall. If a cluster goes down in the forest, does anyone know?
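As a rough worked example of that combination (treating cluster and facility outages as independent, which is optimistic), using the tier figures quoted above:

```python
# Combined availability = cluster availability * facility availability,
# assuming independent outages. Tier figures are from the quote above;
# the cluster figure is a hypothetical four-and-a-half-nines cluster.
cluster_avail = 0.99995
facility_tiers = {"Tier II": 0.997, "Tier III": 0.9998, "Tier IV": 0.99995}

for tier, facility_avail in facility_tiers.items():
    combined = cluster_avail * facility_avail
    print(f"{tier}: ~{combined * 100:.3f}% overall")
```

Whichever factor is worse dominates the product, so a five-nines cluster in a Tier II facility still ends up looking like a Tier II facility.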