Hi Craig,

I assume the reason for the 48-hour recovery time is to keep the cost of the cluster low? I wrote "1h recovery time" because it is roughly the time it would take to move 4TB over a 10Gb/s link. Could you upgrade your hardware to reduce the recovery time to less than two hours? Or are there factors other than cost that prevent this?

Cheers

On 26/08/2014 19:37, Craig Lewis wrote:
> My OSD rebuild time is more like 48 hours (4TB disks, >60% full, osd max backfills = 1). I believe that increases my risk of failure by 48^2. Since your numbers are failure rate per hour per disk, I need to consider the risk for the whole time for each disk. So more formally, rebuild time to the power of (replicas - 1).
>
> So I'm at 2304/100,000,000, or approximately 1/43,000. That's a much higher risk than 1 / 10^8.
>
> A risk of 1/43,000 means that I'm more likely to lose data due to human error than disk failure. Still, I can put a small bit of effort in to optimize recovery speed, and lower this number. Managing human error is much harder.
>
> On Tue, Aug 26, 2014 at 7:12 AM, Loic Dachary <loic at dachary.org> wrote:
>
>     Using percentages instead of numbers led me to calculation errors. Here it is again using 1/100 instead of % for clarity ;-)
>
>     Assuming that:
>
>     * The pool is configured for three replicas (size = 3, which is the default)
>     * It takes one hour for Ceph to recover from the loss of a single OSD
>     * Any other disk has a 1/100,000 chance to fail within the hour following the failure of the first disk (assuming the AFR https://en.wikipedia.org/wiki/Annualized_failure_rate of every disk is 8%, divided by the number of hours in a year: 0.08 / 8760 ~= 1/100,000)
>     * A given disk does not participate in more than 100 PGs
>

-- 
Loïc Dachary, Artisan Logiciel Libre
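For anyone who wants to re-run the arithmetic in this thread, here is a minimal sketch in Python. It only reproduces the figures quoted above (AFR of 8%, Craig's scaling of recovery time to the power of replicas - 1, the 1 / 10^8 baseline for a one-hour recovery, and 4TB over a 10Gb/s link). The function names are made up for illustration, and decimal units (1TB = 10^12 bytes) are an assumption.

```python
HOURS_PER_YEAR = 8760


def hourly_failure_probability(afr: float) -> float:
    """Approximate chance that one disk fails in a given hour, from its AFR."""
    return afr / HOURS_PER_YEAR


def relative_risk(recovery_hours: float, replicas: int) -> float:
    """Scaling proposed in the thread: recovery time ** (replicas - 1)."""
    return recovery_hours ** (replicas - 1)


def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Hours to move `terabytes` (assumed decimal TB) over a `link_gbps` Gb/s link."""
    bits = terabytes * 1e12 * 8
    return bits / (link_gbps * 1e9) / 3600


p_hour = hourly_failure_probability(0.08)   # AFR of 8%, as assumed in the thread
factor = relative_risk(48, 3)               # 48h rebuild, size = 3
baseline = 1e-8                             # the 1 / 10^8 figure for a 1h recovery
risk_48h = factor * baseline

print(f"per-hour failure probability: 1/{1 / p_hour:,.0f}")
print(f"risk multiplier for a 48h rebuild: {factor:,.0f}")
print(f"risk with a 48h rebuild: 1/{1 / risk_48h:,.0f}")
print(f"4TB over 10Gb/s: {transfer_hours(4, 10):.2f} h")
```

The unrounded per-hour figure comes out nearer 1/109,500 than 1/100,000, the 48-hour multiplier is 2304, which gives roughly 1/43,400, and the 4TB transfer takes a bit under one hour at full line rate. Real recovery will be slower than line rate, so the transfer time is a lower bound only.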