Best practice K/M-parameters EC pool

On 15/08/2014 15:42, Erik Logtenberg wrote:
>>>
>>> I haven't done the actual calculations, but given some % chance of disk
>>> failure, I would assume that losing x out of y disks has roughly the
>>> same chance as losing 2*x out of 2*y disks over the same period.
>>>
>>> That's also why you generally want to limit RAID5 arrays to maybe 6
>>> disks or so and move to RAID6 for bigger arrays. For arrays bigger than
>>> 20 disks you would usually split those into separate arrays, just to
>>> keep the (parity disks / total disks) fraction high enough.
>>>
>>> With regard to data safety I would guess that 3+2 and 6+4 are roughly
>>> equal, although the behaviour of 6+4 is probably easier to predict
>>> because bigger numbers make your calculations less dependent on
>>> individual deviations in reliability.
>>>
>>> Do you guys feel this argument is valid?
>>
>> Here is how I reason about it, roughly:
>>
>> If the probability of losing a disk is 0.1% (p = 0.001), the probability of losing two disks simultaneously (i.e. before the first failure can be recovered) would be roughly p^2 = 0.0001%, three disks p^3 = 0.0000001%, and four disks p^4 = 0.0000000001%.
>>
>> Accurately calculating the reliability of the system as a whole is a lot more complex (see https://wiki.ceph.com/Development/Add_erasure_coding_to_the_durability_model/ for more information).
>>
>> Cheers
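
To make that arithmetic concrete, a rough back-of-the-envelope in Python; p = 0.001 is an assumption, and treating failures as independent is optimistic, since real failures tend to be correlated:

    # Back-of-the-envelope only: assumes failures are independent,
    # which is optimistic; correlated failures make the odds worse.
    p = 0.001  # assumed probability a given disk dies before recovery completes

    for n in (2, 3, 4):
        prob = p ** n
        print("P(lose %d disks at once) ~ %.0e = %.0e%%" % (n, prob, prob * 100))

Each additional simultaneous failure buys roughly three orders of magnitude, under that assumption.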
> 
> Okay, I see that in your calculation you leave the total number of
> disks completely out of the equation.

Yes. With a small number of disks I'm not sure how to calculate the durability. For instance, if I have a 50-disk cluster within a single rack, the durability is dominated by the probability that the rack is set on fire, and increasing m from 3 to 5 is most certainly pointless ;-)
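
With made-up numbers (both probabilities below are pure assumptions, only there to illustrate why m stops mattering):

    # Illustration only: both probabilities below are invented.
    p_disk = 0.001  # assumed per-disk failure probability per recovery window
    p_rack = 1e-5   # assumed probability of losing the whole rack (fire, ...)

    for m in (3, 4, 5):
        p_via_disks = p_disk ** (m + 1)  # one more failure than we can tolerate
        print("m=%d: loss via disks ~ %.0e, via rack ~ %.0e" % (m, p_via_disks, p_rack))

Once the disk term is orders of magnitude below the rack term, growing m only shrinks the part that no longer matters.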

> The link you provided is very
> useful indeed and does some actual calculations. Interestingly, the
> example in the details page [1] uses k=32 and m=32 for a total of 64 blocks.
> Those are much bigger values than Mark Nelson mentioned earlier. Is
> that example merely meant to demonstrate the theoretical advantages, or
> would you actually recommend using those numbers in practice?
> Let's assume that we have at least 64 OSDs available; would you
> recommend k=32 and m=32?

It is theoretical: I'm not aware of any Ceph use case requiring that kind of setting. There may be a use case, though; it's not absurd, just not common. I would be happy to hear about it.
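
For what it's worth, the raw space overhead is just (k+m)/k, so k=32/m=32 costs 2x, no more than a tiny k=3/m=3 profile; what grows is the number of OSDs every object touches, and with it the cost of each read and recovery. A quick comparison (plain arithmetic, nothing Ceph-specific):

    # Space overhead of an EC pool is (k + m) / k; what changes with
    # bigger k and m is how many OSDs each object is spread across.
    for k, m in ((3, 2), (6, 4), (3, 3), (32, 32)):
        print("k=%2d m=%2d: overhead %.2fx, survives %2d lost chunks, "
              "spread over %2d OSDs" % (k, m, float(k + m) / k, m, k + m))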

Cheers

> 
> [1]
> https://wiki.ceph.com/Development/Add_erasure_coding_to_the_durability_model/Technical_details_on_the_model
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


