Re: Erasure Coding failure domain (again)

On 10/04/2019 18.11, Christian Balzer wrote:
> Another thing that crossed my mind aside from failure probabilities caused
> by actual HDDs dying is of course the little detail that most Ceph
> installations will have WAL/DB (journal) on SSDs, the most typical
> ratio being 1:4. 
> And given the current thread about compaction killing pure HDD OSDs,
> something you may _have_ to do.
> 
> So if you get unlucky and an SSD dies, 4 OSDs are irrecoverably lost, unlike
> a dead node that can be recovered.
> Combine that with the background noise of HDDs failing, and things just got
> quite a bit scarier.

Certainly, your failure domain should be at least host, and that changes
the math (even without considering whole-host failure).

Let's say you have 375 hosts and 4 OSDs per host, with the failure
domain correctly set to host. Same 50000 pool PGs as before. Now if 3
hosts die:

50000 / (375 choose 3) =~ 0.57% chance of data loss

This is equivalent to having 3 shared SSDs die.
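For anyone who wants to double-check the arithmetic, here's a quick Python
sketch (variable names are mine; it assumes each PG spans exactly 3 hosts
and is lost only if all 3 of those hosts die, which is the simplification
behind the formula above):

    from math import comb  # Python 3.8+

    pgs = 50000      # pool PGs, as before
    hosts = 375
    failed = 3       # simultaneous host failures

    # Each PG lives on a specific set of 3 hosts; data loss requires the
    # 3 failed hosts to be exactly one of those sets.
    p_loss = pgs / comb(hosts, failed)
    print(f"{p_loss:.2%}")   # ~0.57%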

If 3 random OSDs die in different hosts, the chance of data loss would
be 0.57% / (4^3) =~ 0.00896% (a 1-in-4 chance per host that you hit the
OSD a PG actually lives on, and you need to hit all 3). This is
marginally higher than the ~0.00891% with uniformly distributed PGs,
because the host failure domain has eliminated all sets of 3 OSDs that
share a host.
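The same sketch extended to the OSD-level comparison (same assumptions,
1500 OSDs total):

    from math import comb  # Python 3.8+

    pgs = 50000
    hosts = 375
    osds_per_host = 4
    osds = hosts * osds_per_host           # 1500 OSDs total

    p_hosts = pgs / comb(hosts, 3)         # 3 hosts (or 3 shared SSDs) die
    # 3 OSDs die, each in a different host: 1-in-4 chance per host of
    # hitting the OSD the PG actually lives on, and all 3 must hit.
    p_osds_distinct_hosts = p_hosts / osds_per_host ** 3
    # 3 OSDs die with PGs distributed uniformly over all 1500 OSDs.
    p_uniform = pgs / comb(osds, 3)

    print(f"{p_hosts:.4%}")                # ~0.5735%
    print(f"{p_osds_distinct_hosts:.5%}")  # ~0.00896%
    print(f"{p_uniform:.5%}")              # ~0.00891%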


-- 
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub


