Hello,

In my "Sanity check" thread I postulated yesterday that to get the same redundancy and resilience against disk failures (excluding other factors) as my proposed setup (2 nodes, each with 2 RAID6 arrays of 11x 3TB HDs plus 2 global hotspares, thus 4 OSDs), the "Ceph way" would need something like 6 nodes with 10x 3TB HDs each and 3-way replication (to protect against dual disk failures) to get similar capacity, plus a 7th identical node to allow for node failure/maintenance.

That was basically based on me thinking "must not get caught by a dual disk failure ever again", as that has happened to me twice: once with a RAID5 and the expected consequences, once with a RAID10 where I got lucky (8 disks total each time).

However something was nagging at the back of my brain, and it turned out to be my long forgotten statistics classes in school. ^o^

So after reading some articles basically saying the same thing, I found this:

https://www.memset.com/tools/raid-calculator/

Now this is based on assumptions, onto which I will add some more, but the last sentence on that page is still quite valid.

So let's compare the 2 configurations above. I assumed 75MB/s recovery speed for the RAID6 configuration, something I've seen in practice. Basically that's half speed, something that will be lower during busy hours and higher during off-peak hours. I made a similar assumption for Ceph with a 10Gb/s network, namely 500MB/s recovery/rebalancing speed. The rebalancing would have to compete with other replication traffic (likely not much of an issue) and with the actual speed/load of the individual drives involved.

Note that if we assume a totally quiet setup, where 100% of all resources would be available for recovery, the numbers would of course change, but NOT their ratios.

I went with the default disk lifetime of 3 years and 0 days replacement time. The latter of course gives very unrealistic results for anything w/o a hotspare drive, but we're comparing 2 different beasts here.

So, all that said, the results from that page that make sense for this comparison are the RAID6 + 1 hotspare numbers. As in: how likely is a 3rd drive failure in the time before recovery is complete? The replacement setting of 0 gives us the best possible number, and since one would deploy a Ceph cluster with sufficient extra capacity, that is what we shall use.

For the RAID6 setup (one 11-disk array plus its hotspare, 12 HDs total; at 75MB/s rebuilding a 3TB disk takes roughly 11 hours) this gives us a pretty comfortable 1 in 58497.9 ratio of data loss per year.

Alas, for the 70 HDs in the comparable Ceph configuration (re-replicating a lost 3TB disk at 500MB/s takes well under 2 hours) we wind up with just a 1 in 13094.31 ratio, which while still quite acceptable clearly shows where this is going.

So am I completely off my wagon here? How do people deal with this when potentially deploying hundreds of disks in a single cluster/pool?

I mean, when we get to 600 disks (and that's just one rack full, OK, maybe 2 due to load and other issues ^o^) of those 4U 60-disk storage servers (or 72 disks per 4U if you're happy with killing another drive when replacing a faulty one in that Supermicro contraption), that ratio is down to 1 in 21.6, which is way worse than the 8-disk RAID5 I mentioned up there.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
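
P.S.: For anybody who wants to play with these numbers themselves, here is a rough back-of-the-envelope Python sketch of the kind of calculation that calculator does. It uses the assumptions from above (3 year disk lifetime, 3TB disks, 75MB/s and 500MB/s recovery speeds) plus my own simplifications, namely an exponential failure model and, for Ceph, the crude worst case that any 2 further failures anywhere in the cluster during the recovery window mean data loss (ignoring PG placement). So the absolute numbers will not match the memset figures, but the trend is the same.

from math import exp

HOURS_PER_YEAR = 24 * 365.0
DISK_TB        = 3.0                      # 3TB disks
MTTF_HOURS     = 3 * HOURS_PER_YEAR       # assumed 3 year disk lifetime
FAIL_RATE      = 1.0 / MTTF_HOURS         # per-disk failure rate per hour

def p_fail_within(hours):
    # chance that one healthy disk dies within 'hours' (exponential model)
    return 1.0 - exp(-FAIL_RATE * hours)

def recovery_hours(tb, mb_per_s):
    # time needed to re-write 'tb' terabytes of data at 'mb_per_s'
    return tb * 1e6 / mb_per_s / 3600.0

def p_two_more_failures(n_remaining, window_hours):
    # probability that at least 2 of the n_remaining disks die before
    # the recovery window closes (binomial over the per-disk chance)
    p = p_fail_within(window_hours)
    return (1.0 - (1.0 - p) ** n_remaining
                - n_remaining * p * (1.0 - p) ** (n_remaining - 1))

# RAID6, 11 disks + hotspare: after the first failure, 2 more failures
# among the remaining 10 disks before the ~11h rebuild ends mean data loss.
p_loss_raid = p_two_more_failures(10, recovery_hours(DISK_TB, 75.0))

# Ceph, 3x replication, 70 disks: crude worst case where any 2 further
# failures among the other 69 disks during the ~1.7h re-replication
# window count as data loss (ignores PG placement, so it overestimates).
p_loss_ceph = p_two_more_failures(69, recovery_hours(DISK_TB, 500.0))

# turn the per-incident probabilities into a rough annual "1 in N" figure
for name, disks, p_loss in (("RAID6, 12 HDs", 12, p_loss_raid),
                            ("Ceph,  70 HDs", 70, p_loss_ceph)):
    first_failures_per_year = disks * HOURS_PER_YEAR * FAIL_RATE
    print("%s: roughly 1 in %.0f per year" %
          (name, 1.0 / (first_failures_per_year * p_loss)))

A PG-aware model would give the Ceph side somewhat better numbers than this worst case; I kept it simple on purpose.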