Keep in mind performance, as well. Once you start getting into higher 'k' values with EC, you've got a lot more drives involved that need to return completions for operations, and on rotational drives this becomes especially painful. We use 8+3 for a lot of our purposes, as it's a good balance of efficiency, durability (number of complete host failures we can tolerate), and enough performance. It's definitely significantly slower than something like 4+2 or 3x replicated, though. It also means we don't deploy clusters below 14 hosts, so we can tolerate multiple host failures _and still accept writes_.

It never fails that you have a host issue, and while working on that, another host dies. Same lessons many learn with RAIDs with single drive redundancy - lose a drive, start a rebuild, another drive fails and data gone. It's almost always the correct response to err on the side of durability when it comes to these decisions, unless the data is unimportant and maximum performance is required.

On Tue, Sep 14, 2021 at 8:20 AM Eugen Block <eblock@xxxxxx> wrote:
>
> Hi,
>
> consider yourself lucky that you haven't had a host failure. But I
> would not draw the wrong conclusions here and change the
> failure-domain based on luck.
> In our production cluster we have an EC pool for archive purposes, it
> all went well for quite some time and last Sunday one of the hosts
> suddenly failed, we're still investigating the root cause. Our
> failure-domain is host and I'm glad that we chose a suitable EC
> profile for that, the cluster is healthy.
>
> > Also what is the "optimal" like 12:3 or ?
>
> You should evaluate that the other way around. What are your specific
> requirements regarding resiliency (how many hosts can fail at the same
> time without data loss)? How many hosts are available? Are you
> planning to expand in the near future? Based on this evaluation you
> can conclude a few options and choose the best for your requirements.
>
> Regards,
> Eugen
>
>
> Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
>
> > Hi,
> >
> > What's your take on an OSD-based erasure-code setup? I've never been
> > brave enough to use an OSD-based CRUSH rule because I was scared of
> > host failure, but in the last 4 years we have never had any host
> > issue, so I'm thinking of changing to that and using some more
> > cost-effective EC.
> >
> > Also what is the "optimal" like 12:3 or ?
> >
> > Thank you
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
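
To put rough numbers on the 8+3 vs. 4+2 vs. 3x comparison and the 14-host minimum mentioned at the top of this message, here is a minimal arithmetic sketch. It is not taken from any Ceph tooling: the names `ECProfile`, `hosts_with_spares` and `replicated_usable_fraction` are made up for illustration. It also assumes a host failure domain with one chunk per host, and it assumes the 14-host figure comes from k+m plus m spare hosts (11 + 3), which is one plausible reading and is not spelled out in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECProfile:
    """Hypothetical helper for back-of-the-envelope EC sizing (not a Ceph API)."""

    k: int  # data chunks
    m: int  # coding chunks

    @property
    def chunks(self) -> int:
        # With failure domain = host, each chunk lands on a different host,
        # so this is also the bare minimum number of hosts for the pool.
        return self.k + self.m

    @property
    def usable_fraction(self) -> float:
        # Share of raw capacity that ends up holding user data.
        return self.k / self.chunks

    def hosts_with_spares(self, spare_hosts: int) -> int:
        # Hosts needed so that, after `spare_hosts` hosts have failed,
        # there are still k+m distinct hosts left to place every chunk on.
        return self.chunks + spare_hosts


def replicated_usable_fraction(size: int) -> float:
    # A replicated pool with `size` copies stores every object `size` times.
    return 1 / size


if __name__ == "__main__":
    for label, profile in (("8+3", ECProfile(8, 3)), ("4+2", ECProfile(4, 2))):
        print(
            f"EC {label}: usable {profile.usable_fraction:.0%}, "
            f"survives {profile.m} host failures without data loss, "
            f"needs at least {profile.chunks} hosts, "
            f"{profile.hosts_with_spares(profile.m)} with {profile.m} spare hosts"
        )
    print(f"3x replicated: usable {replicated_usable_fraction(3):.0%}")
```

The same helper can be pointed at a wider profile such as the 12:3 asked about in the original question, to see how quickly the minimum host count grows as 'k' goes up.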