On 08/15/2014 06:24 AM, Wido den Hollander wrote:
> On 08/15/2014 12:23 PM, Loic Dachary wrote:
>> Hi Erik,
>>
>> On 15/08/2014 11:54, Erik Logtenberg wrote:
>>> Hi,
>>>
>>> With EC pools in Ceph you are free to choose any K and M parameters you
>>> like. The documentation explains what K and M do, so far so good.
>>>
>>> Now, there are certain combinations of K and M that appear to have more
>>> or less the same result. Do any of these combinations have pros and
>>> cons that I should consider, and/or are there best practices for
>>> choosing the right K/M parameters?
>>>
>
> Loic might have a better answer, but I think that the more segments (K)
> you have, the heavier the recovery. You have to contact more OSDs to
> reconstruct the whole object, so that involves more disks doing seeks.
>
> I heard somebody from Fujitsu say that he thought 8/3 was best for most
> situations. That wasn't with Ceph, though, but with a different system
> which implemented erasure coding.

Performance is definitely lower with more segments in Ceph. I kind of
gravitate toward 4/2 or 6/2, though that's just my own preference.

>
>>> For instance, if I choose K = 3 and M = 2, then PGs in this pool will
>>> use 5 OSDs and sustain the loss of 2 OSDs. There is 40% overhead in
>>> this configuration.
>>>
>>> Now, if I were to choose K = 6 and M = 4, I would end up with PGs that
>>> use 10 OSDs and sustain the loss of 4 OSDs, which is statistically not
>>> so much different from the first configuration. Also there is the same
>>> 40% overhead.
>>
>> Although I don't have numbers in mind, I think the odds of losing two
>> OSDs simultaneously are a lot smaller than the odds of losing four OSDs
>> simultaneously. Or am I misunderstanding you when you write
>> "statistically not so much different from the first configuration"?
>>
>
> Losing two smaller than losing four? Is that correct, or did you mean
> it the other way around?
>
> I'd say that losing four OSDs simultaneously is less likely to happen
> than two simultaneously.

This is true, though the more disks you spread your objects across, the
higher the likelihood that any given object will be affected by a lost
OSD. The extreme case is that every object is spread across every OSD,
so losing any given OSD affects all objects. I suppose the severity
depends on the size of your erasure coding stripe (K+M) relative to the
total number of OSDs. I think this is perhaps what Erik was getting at.

>
>> Cheers
>>
>>> One rather obvious difference between the two configurations is that the
>>> latter requires a cluster with at least 10 OSDs to make sense. But
>>> let's say we have such a cluster, which of the two configurations would
>>> be recommended, and why?
>>>
>>> Thanks,
>>>
>>> Erik.
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
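P.S. For what it's worth, here is a rough back-of-the-envelope sketch of the
trade-offs being discussed. It's plain Python, nothing Ceph-specific; the
cluster size of 20 OSDs and the assumption that chunks land uniformly across
the cluster are just illustrative, not numbers from this thread.

    # Back-of-the-envelope comparison of erasure coding profiles.
    # N_OSDS = 20 is only an example cluster size.

    N_OSDS = 20

    def describe(k, m, n_osds=N_OSDS):
        width = k + m              # OSDs touched by each object (stripe width)
        overhead = m / width       # fraction of raw storage spent on parity
        # Rough chance that a single failed OSD holds a chunk of a given
        # object, assuming chunks are spread uniformly over the cluster.
        p_affected = width / n_osds
        print(f"K={k} M={m}: stripe width {width}, "
              f"survives {m} OSD losses, "
              f"{overhead:.0%} of raw space is parity, "
              f"~{p_affected:.0%} of objects touched per single OSD failure")

    for k, m in [(3, 2), (6, 4), (4, 2), (6, 2), (8, 3)]:
        describe(k, m)

It reproduces the 40% parity overhead Erik mentions for both 3/2 and 6/4, but
also shows that the 10-wide stripe touches twice as many OSDs, so any single
failure involves more objects and more disks during recovery.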