The customer's requirement was to sustain the loss of one of the two
datacenters plus two additional hosts. The CRUSH failure domain is
"host". There are 10 hosts in each DC, so we place 9 chunks in each
DC, leaving one spare host per DC so the cluster can recover fully
after a single host failure. With k=7 and m=11 that makes 18 chunks in
total: losing a whole DC removes 9 of them, and the remaining 9 still
cover two further host failures before fewer than k chunks are left.
This has already worked out nicely: they had a power outage in one DC
and were very happy after the cluster recovered.
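For illustration, a layout like that could be expressed with an EC
profile plus a custom CRUSH rule roughly as follows; the profile, rule
and pool names below are made up, and the actual setup may well have
differed in its details:

  ceph osd erasure-code-profile set ec-7-11 \
      k=7 m=11 \
      crush-failure-domain=host

  # Custom rule in the decompiled CRUSH map (assumes two datacenter
  # buckets under the default root): pick both DCs, then 9 distinct
  # hosts in each, 18 chunks total.
  rule ec_two_dc {
      id 5
      type erasure
      step set_chooseleaf_tries 5
      step set_choose_tries 100
      step take default
      step choose indep 2 type datacenter
      step chooseleaf indep 9 type host
      step emit
  }

  ceph osd pool create ecpool 1024 1024 erasure ec-7-11 ec_two_dc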
I don't remember the details of the decision process anymore; we
discussed a few options and decided this one fit best with respect to
resiliency, storage overhead and the number of chunks.
Zitat von "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
What's the aim of having such a big m value?
How many servers are in this cluster?
Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------
On 2022. Feb 21., at 19:20, Eugen Block <eblock@xxxxxx> wrote:
Hi,
it really depends on the resiliency requirements and the use case. We
have a couple of customers with EC profiles like k=7 m=11. The
potential waste of space Anthony already mentions has to be
considered, of course. As for performance, we haven't heard any
complaints yet, but the clusters I'm referring to are archives with
no high performance requirements, just high requirements for
datacenter resiliency.
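To put a rough number on the overhead: with k=7 and m=11 the
raw:usable ratio is (k+m)/k = 18/7 ≈ 2.6, so a bit less than size=3
replication (3.0) while tolerating the loss of up to 11 of the 18
chunks.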
Regards,
Eugen
Quoting Anthony D'Atri <anthony.datri@xxxxxxxxx>:
A couple of years ago someone on the list wrote:
3) k should only have small prime factors, power of 2 if possible
I tested k=5,6,8,10,12. Best results in decreasing order: k=8, k=6. All
other choices were poor. The value of m seems not relevant for performance.
Larger k will require more failure domains (more hardware).
I suspect that k being a power of 2 aligns with sharding, for
efficiency and perhaps even to minimize space wasted due to internal
fragmentation.
As k increases, one sees diminishing returns in the incremental
raw:usable ratio, so I would think that for most purposes aligning
to a power of 2 wouldn't have much of a downside. Depending on your
workload, large values could result in wasted space, analogous to
e.g. the dynamics of tiny S3 objects vs a large min_alloc_size:
https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit#gid=358760253
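To make the diminishing returns concrete, holding m=2 fixed for
illustration, the raw:usable ratio (k+m)/k works out to:

  k=2  -> 4/2   = 2.000
  k=4  -> 6/4   = 1.500
  k=8  -> 10/8  = 1.250
  k=16 -> 18/16 = 1.125

Each doubling of k buys progressively less space efficiency while
widening the hardware and rebuild footprint.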
I usually recommend a 2,2 profile as a safer alternative to
replication with size=2, and 4,2 for additional space efficiency at
the cost of a larger fault domain footprint, rebuild overhead, etc.
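As a minimal sketch (profile and pool names are placeholders), a 2,2
profile could be created like this:

  ceph osd erasure-code-profile set ec-2-2 \
      k=2 m=2 \
      crush-failure-domain=host
  ceph osd pool create ec22pool 128 128 erasure ec-2-2

Both EC 2,2 and replication with size=2 cost 2x raw capacity, but the
2,2 pool can lose any two chunks of a PG without losing data (I/O may
pause below min_size until recovery), whereas size=2 loses data as
soon as the second copy is gone.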
ymmocv
On Feb 18, 2022, at 1:13 PM, ashley@xxxxxxxxxxxxxx wrote:
I have read in a few places that it's recommended to set k to a power
of 2. Is this still a “thing” with the latest/current Ceph versions
(quite a few of those articles are from years ago), or is a
non-power-of-2 value now equally fine performance-wise?
Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx