Hello,

As always, this has been discussed in the past, with people taking away
various bits of "truth" from it. For a precise failure model, the latest
one is here:
https://wiki.ceph.com/Development/Reliability_model/Final_report

And there is also this, which the last time it came up I felt didn't
take all things into account (especially realistic times to full
recovery):
https://github.com/ceph/ceph-tools/tree/master/models/reliability

The more disks you have, the more likely a triple failure becomes.
OTOH, as Dan pointed out, the chance of them sharing PGs goes down.

In the OP's example of 4 nodes with 4 disks each, having 3 disks fail
on 3 different nodes at the same time will certainly lose data. In a
cluster of 100 nodes with 12 disks each that certainty becomes a
probability. How much of one, I'll leave to people who feel comfortable
with those levels of math.
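For anyone who wants a rough feel for the numbers anyway, here is a
quick back-of-the-envelope sketch in Python (my own hypothetical
p_triple_loses_data() helper, nothing shipped with Ceph). It is not a
CRUSH simulation: it assumes the default replicated rule with size 3
and host as the failure domain, treats each PG's acting set as a
roughly uniform random pick among all OSD triples on three distinct
hosts (real CRUSH only approximates that), and assumes all three disks
die before any recovery completes. The PG counts below are guesses,
not anybody's actual pg_num.

    # Rough estimate of the chance that 3 simultaneously failed OSDs on
    # 3 different hosts share at least one PG (i.e. actually lose data).
    def p_triple_loses_data(hosts, osds_per_host, pgs):
        # OSD triples spread over 3 distinct hosts: C(hosts,3) * osds_per_host^3
        eligible_triples = (hosts * (hosts - 1) * (hosts - 2) // 6) * osds_per_host ** 3
        # Chance that none of the PGs maps exactly onto the failed triple,
        # treating each acting set as an independent uniform pick.
        p_miss_all_pgs = (1.0 - 1.0 / eligible_triples) ** pgs
        return 1.0 - p_miss_all_pgs

    # The OP's cluster: 4 hosts x 4 OSDs, with e.g. 512 PGs.
    print(p_triple_loses_data(4, 4, 512))       # ~0.87, more PGs/pools push it towards 1
    # 100 hosts x 12 OSDs, ~100 PGs per OSD, so roughly 40000 PGs of size 3.
    print(p_triple_loses_data(100, 12, 40000))  # ~1.4e-4

For a real cluster the authoritative check is still what Dan describes
below: look at the [x,y,z] acting sets in "ceph pg dump" and see
whether your three dead OSDs ever appear together.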
Note that I consider a node failure something that might stop my Ceph
cluster from working (if it were to drop below min_size), but doesn't
result in data loss.

Christian

On Wed, 10 Jun 2015 09:55:22 +0200 Dan van der Ster wrote:

> This is a CRUSH misconception. Triple drive failures only cause data
> loss when they share a PG (e.g. ceph pg dump .. those [x,y,z] triples
> of OSDs are the only ones that matter). If you have very few OSDs,
> then it's possibly true that any combination of disks would lead to
> failure. But as you increase the number of OSDs, the likelihood of a
> triple sharing a PG decreases (even though the number of 3-way
> combinations increases).
>
> Cheers, Dan
>
> On Wed, Jun 10, 2015 at 8:47 AM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> > The hidden danger in the default CRUSH rules is that if you lose 3
> > drives in 3 different hosts at the same time, you _will_ lose data,
> > and not just some data but possibly a piece of every rbd volume you
> > have... And the probability of that happening is sadly nowhere near
> > zero. We had drives drop out of the cluster under load, which of
> > course comes when a drive fails, then another fails, then another
> > fails… not pretty.
> >
> > Jan
> >
> >> On 09 Jun 2015, at 18:11, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> >>
> >> If you are using the default rule set (which I think has min_size 2),
> >> you can sustain 1-4 disk failures or one host failure.
> >>
> >> The reason disk failures vary so wildly is that you can lose all the
> >> disks in one host.
> >>
> >> You can lose up to another 4 disks (in the same host) or 1 host
> >> without data loss, but I/O will block until Ceph can replicate at
> >> least one more copy (assuming the min_size 2 stated above).
> >> ----------------
> >> Robert LeBlanc
> >> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
> >>
> >>
> >> On Tue, Jun 9, 2015 at 9:53 AM, kevin parrikar wrote:
> >> > I have a 4 node cluster, each with 5 disks (4 OSDs and 1 for the
> >> > operating system, also hosting 3 monitor processes), with the
> >> > default replica 3.
> >> >
> >> > Total OSD disks : 16
> >> > Total Nodes : 4
> >> >
> >> > How can I calculate the
> >> >
> >> > Maximum number of disk failures my cluster can handle without any
> >> > impact on current data and new writes.
> >> > Maximum number of node failures my cluster can handle without any
> >> > impact on current data and new writes.
> >> >
> >> > Thanks for any help

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com