Here is my crushmap so you can see our general setup. We are using the bottom rule for the EC pool. We are trying to get to the point where we can lose an entire host and have the cluster continue to work.

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
device 25 osd.25 class hdd
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class hdd
device 29 osd.29 class hdd
device 30 osd.30 class hdd
device 31 osd.31 class hdd
device 32 osd.32 class hdd
device 33 osd.33 class hdd
device 34 osd.34 class hdd
device 35 osd.35 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host osd01tv01 {
    id -3           # do not change unnecessarily
    id -4 class hdd # do not change unnecessarily
    # weight 109.152
    alg straw2
    hash 0          # rjenkins1
    item osd.0 weight 9.096
    item osd.3 weight 9.096
    item osd.6 weight 9.096
    item osd.9 weight 9.096
    item osd.12 weight 9.096
    item osd.15 weight 9.096
    item osd.18 weight 9.096
    item osd.21 weight 9.096
    item osd.24 weight 9.096
    item osd.27 weight 9.096
    item osd.30 weight 9.096
    item osd.33 weight 9.096
}
host osd02tv01 {
    id -5           # do not change unnecessarily
    id -6 class hdd # do not change unnecessarily
    # weight 109.152
    alg straw2
    hash 0          # rjenkins1
    item osd.1 weight 9.096
    item osd.4 weight 9.096
    item osd.7 weight 9.096
    item osd.10 weight 9.096
    item osd.13 weight 9.096
    item osd.16 weight 9.096
    item osd.19 weight 9.096
    item osd.22 weight 9.096
    item osd.25 weight 9.096
    item osd.28 weight 9.096
    item osd.31 weight 9.096
    item osd.34 weight 9.096
}
host osd03tv01 {
    id -7           # do not change unnecessarily
    id -8 class hdd # do not change unnecessarily
    # weight 109.152
    alg straw2
    hash 0          # rjenkins1
    item osd.2 weight 9.096
    item osd.5 weight 9.096
    item osd.8 weight 9.096
    item osd.11 weight 9.096
    item osd.14 weight 9.096
    item osd.17 weight 9.096
    item osd.20 weight 9.096
    item osd.23 weight 9.096
    item osd.26 weight 9.096
    item osd.29 weight 9.096
    item osd.32 weight 9.096
    item osd.35 weight 9.096
}
root default {
    id -1           # do not change unnecessarily
    id -2 class hdd # do not change unnecessarily
    # weight 327.441
    alg straw2
    hash 0          # rjenkins1
    item osd01tv01 weight 109.147
    item osd02tv01 weight 109.147
    item osd03tv01 weight 109.147
}

# rules
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
rule default.rgw.buckets.data {
    id 1
    type erasure
    min_size 3
    max_size 3
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 2 type host
    step choose indep 2 type osd
    step emit
}

# end crush map
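For what it's worth, a crushtool dry run along these lines should show how the EC rule maps chunks. The file names are just placeholders, rule id 1 is taken from the map above, and --num-rep has to match the pool's actual k+m (4 below is only an example value):

    # compile the decompiled map above back into a binary map
    crushtool -c crushmap.txt -o crushmap.bin

    # simulate placements for the EC rule and print the OSDs chosen for each input
    crushtool -i crushmap.bin --test --rule 1 --num-rep 4 --show-mappings

    # report any inputs that could not be mapped to the full number of OSDs
    crushtool -i crushmap.bin --test --rule 1 --num-rep 4 --show-bad-mappings

If I am reading the bottom rule correctly, it places two chunks on each of two hosts, so a single host failure takes out two chunks at once and the profile needs m >= 2 for the pool to keep serving I/O.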
Thanks again for all the help!

Tim Gipson
Systems Engineer

On 11/12/17, 10:57 PM, "Christian Wuerdig" <christian.wuerdig@xxxxxxxxx> wrote:

Well, as stated in the other email, I think in the EC scenario you can set size = k+m for the pgcalc tool. If you want 10+2, then in theory you should be able to get away with 6 nodes to survive a single node failure, provided you can guarantee that every node will always receive 2 of the 12 chunks. It looks like this might be achievable: http://ceph.com/planet/erasure-code-on-small-clusters/

On Mon, Nov 13, 2017 at 1:32 PM, Tim Gipson <tgipson@xxxxxxx> wrote:
> I guess my questions are more centered around k+m and PG calculations.
>
> As we started to build and test our EC pools with our infrastructure, we were trying to figure out what our calculations needed to be, starting with 3 OSD hosts with 12 x 10 TB OSDs apiece. The nodes can expand to 24 drives apiece, and we hope to eventually get to around a 1 PB cluster after we add some more hosts. Initially we hoped to be able to do k=10, m=2 on the pool, but I am not sure that is going to be feasible. We'd like to set up the failure domain so that we would be able to lose an entire host without losing the cluster. At this point I'm not sure that's possible without bringing in more hosts.
>
> Thanks for the help!
>
> Tim Gipson
>
>
> On 11/12/17, 5:14 PM, "Christian Wuerdig" <christian.wuerdig@xxxxxxxxx> wrote:
>
> I might be wrong, but from memory I think you can use
> http://ceph.com/pgcalc/ and use k+m for the size.
>
> On Sun, Nov 12, 2017 at 5:41 AM, Ashley Merrick <ashley@xxxxxxxxxxxxxx> wrote:
> > Hello,
> >
> > Are you having any issues with getting the pool working, or just around the
> > PG num you should use?
> >
> > ,Ashley
> >
> > ________________________________
> > From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Tim Gipson
> > <tgipson@xxxxxxx>
> > Sent: Saturday, November 11, 2017 5:38:02 AM
> > To: ceph-users@xxxxxxxxxxxxxx
> > Subject: Erasure Coding Pools and PG calculation - documentation
> >
> > Hey all,
> >
> > I'm having some trouble setting up a pool for erasure coding. I haven't
> > found much documentation around the PG calculation for an erasure-coded
> > pool. It seems from what I've tried so far that the math needed to set one
> > up is different from the math you use to calculate PGs for a regular
> > replicated pool.
> >
> > Does anyone have any experience setting up a pool this way? Can you give
> > me some help or direction, or point me toward some documentation that goes
> > over the math behind this sort of pool setup?
> >
> > Any help would be greatly appreciated!
> >
> > Thanks,
> >
> >
> > Tim Gipson
> > Systems Engineer

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com