Re: calculating maximum number of disk and node failures that can be handled by a cluster without data loss

This is a CRUSH misconception. A triple drive failure only causes data
loss when the three drives share a PG (e.g. in ceph pg dump, those [x,y,z]
triples of OSDs are the only combinations that matter). If you have very
few OSDs, then it's possibly true that any combination of disks would lead
to data loss. But as you increase the number of OSDs, the likelihood of a
given triple sharing a PG decreases (even though the number of 3-way
combinations increases).

Cheers, Dan
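
For anyone who wants to see exactly which triples those are on their own
cluster, here is a rough sketch that pulls the acting sets out of
ceph pg dump --format json. The JSON field names ("pg_stats", "acting")
are assumptions and vary between Ceph releases, so adjust them to match
your own dump.

import json
from itertools import combinations

def fatal_triples(path):
    # Expects the output of: ceph pg dump --format json > pgdump.json
    # The layout differs between releases (pg_stats may sit at the top
    # level or under pg_map), so treat these lookups as a starting point.
    with open(path) as f:
        dump = json.load(f)
    pg_stats = dump.get("pg_stats") or dump.get("pg_map", {}).get("pg_stats", [])
    triples = set()
    for pg in pg_stats:
        acting = pg.get("acting", [])
        # With 3-way replication each acting set is one potentially fatal triple.
        for combo in combinations(sorted(acting), 3):
            triples.add(combo)
    return triples

fatal = fatal_triples("pgdump.json")
print(len(fatal), "OSD triples would lose data if they all failed at once")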

On Wed, Jun 10, 2015 at 8:47 AM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> A hidden danger in the default CRUSH rules is that if you lose 3 drives in 3 different hosts at the same time, you _will_ lose data, and not just some data but possibly a piece of every rbd volume you have...
> And the probability of that happening is sadly nowhere near zero. We had drives drop out of the cluster under load, which of course arrives when a drive fails; then another fails, then another fails… not pretty.
>
> Jan
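
To put rough numbers on "nowhere near zero", here is a back-of-the-envelope
sketch; pg_num=1024 and 3-way replication are illustrative assumptions, not
a measurement of any particular cluster.

from math import comb

def fatal_triple_fraction_bound(num_osds, pg_num, replicas=3):
    # Each PG maps to one acting set of `replicas` OSDs, so with 3-way
    # replication there are at most `pg_num` distinct "fatal" triples.
    # The fraction of all C(num_osds, 3) possible triples that could lose
    # data if they failed together is therefore bounded by this ratio.
    total = comb(num_osds, replicas)
    return min(pg_num, total) / total

# A tiny cluster vs. larger ones, all with an illustrative pg_num of 1024:
for n in (16, 64, 256):
    print(n, round(fatal_triple_fraction_bound(n, pg_num=1024), 6))

With only 16 OSDs essentially every triple backs some PG, which is the
situation Jan describes; with a few hundred OSDs only a small fraction of
triples do, which is Dan's point above.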
>
>> On 09 Jun 2015, at 18:11, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>>
>> If you are using the default rule set (which I think has min_size 2),
>> you can sustain 1-4 disk failures or one host failure.
>>
>> The reason the number of tolerable disk failures varies so widely is
>> that you can lose all the disks in a single host.
>>
>> You can lose up to another 4 disks (in the same host) or 1 more host
>> without data loss, but I/O will block until Ceph can replicate at
>> least one more copy (assuming the min_size 2 stated above).
>> ----------------
>> Robert LeBlanc
>> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
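
Applying that rule of thumb to the 4-host, 4-OSDs-per-host cluster from the
question below, a small sketch; size, min_size and the one-replica-per-host
placement are assumptions matching the defaults described above.

def failure_tolerance(hosts, osds_per_host, size=3, min_size=2):
    # Illustrative sketch assuming the default CRUSH rule: each of the
    # `size` replicas of a PG lands on a different host, so whole-host
    # failures are the worst case.  The disk counts below assume the
    # failed disks are confined to that many hosts.
    assert hosts >= size, "need at least `size` hosts to place all replicas"
    return {
        # lose this many hosts (or all their disks) and every PG still has
        # >= min_size replicas: no data loss, I/O keeps flowing
        "hosts_without_impact": size - min_size,
        "disks_without_impact": (size - min_size) * osds_per_host,
        # lose this many hosts and every PG still has >= 1 replica: no data
        # loss, but I/O to the affected PGs blocks until recovery
        "hosts_without_data_loss": size - 1,
        "disks_without_data_loss": (size - 1) * osds_per_host,
    }

print(failure_tolerance(hosts=4, osds_per_host=4))
# 1 host / 4 disks with no impact; 2 hosts / 8 disks before actual data loss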
>>
>>
>> On Tue, Jun 9, 2015 at 9:53 AM, kevin parrikar  wrote:
>> > I have a 4-node cluster, each node with 5 disks (4 OSDs and 1
>> > operating-system disk; the cluster also hosts 3 monitor processes),
>> > with the default replica count of 3.
>> >
>> > Total OSD disks : 16
>> > Total Nodes : 4
>> >
>> > How can I calculate the following:
>> >
>> > Maximum number of disk failures my cluster can handle without any impact
>> > on current data and new writes.
>> > Maximum number of node failures my cluster can handle without any impact
>> > on current data and new writes.
>> >
>> > Thanks for any help
>> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




