Yes, my goal is to make it so that losing 3 OSDs does not lose data. My 6 racks may not be in different rooms, but they use 6 different switches, so I want my data to still be accessible when any switch is down or unreachable. I think it's not an unrealistic requirement. Thanks!

LeiDong.

On 9/9/14, 10:02 PM, "Loic Dachary" <loic at dachary.org> wrote:

>
> On 09/09/2014 14:21, Lei Dong wrote:
>> Thanks Loic!
>>
>> Actually I've found that increasing choose_local_fallback_tries can help (chooseleaf_tries helps less significantly), but I'm afraid that when an OSD failure happens and a new acting set needs to be found, it may again fail to find enough racks. So I'm trying to find a more guaranteed way in case of OSD failure.
>>
>> My profile is nothing special other than k=8 m=3.
>
> So your goal is to make it so that losing 3 OSDs simultaneously does not mean losing data. By forcing each rack to hold at most 2 OSDs for a given object, you make it so that losing a full rack does not mean losing data. Are these racks in the same room in the datacenter? In the event of a catastrophic failure that permanently destroys one rack, how realistic is it that the other racks are unharmed? If the rack is destroyed by fire and is in a row with the six other racks, there is a very high chance that the other racks will also be damaged. Note that I am not a system architect nor a system administrator: I may be completely wrong ;-) If it turns out that the probability of a single rack failing entirely and independently of the others is negligible, it may not be necessary to make a complex ruleset, and the default ruleset could be used instead.
>
> My 2cts
>
>>
>> Thanks again!
>>
>> Leidong
>>
>>> On Sep 9, 2014, at 7:53 PM, "Loic Dachary" <loic at dachary.org> wrote:
>>>
>>> Hi,
>>>
>>> It is indeed possible that the mapping fails if there are just enough racks to match the constraint. And the probability of a bad mapping increases as the number of PGs increases, because more mappings are needed. You can tell CRUSH to try harder with
>>>
>>> step set_chooseleaf_tries 10
>>>
>>> Be careful though: increasing this number will change the mapping. It will not just fix the bad mappings you're seeing, it will also change the mappings that succeeded with a lower value. Once you've set this parameter, it cannot be modified.
>>>
>>> Would you mind sharing the erasure code profile you plan to work with?
>>>
>>> Cheers
>>>
>>>> On 09/09/2014 12:39, Lei Dong wrote:
>>>> Hi ceph users:
>>>>
>>>> I want to create a customized CRUSH rule for my EC pool (with replica_size = 11) to distribute replicas into 6 different racks.
>>>>
>>>> I use the following rule at first:
>>>>
>>>> step take default                // root
>>>> step choose firstn 6 type rack   // 6 racks; I have and only have 6 racks
>>>> step chooseleaf indep 2 type osd // 2 OSDs per rack
>>>> step emit
>>>>
>>>> It looks fine and works fine when the PG num is small. But when the PG num increases, there are always some PGs which cannot take all 6 racks. It looks like "step choose firstn 6 type rack" sometimes returns only 5 racks. After some investigation, I think it may be caused by collisions between choices.
>>>>
>>>> Then I came up with another solution to avoid the collisions, like this:
>>>>
>>>> step take rack0
>>>> step chooseleaf indep 2 type osd
>>>> step emit
>>>> step take rack1
>>>> ...
>>>> (manually take every rack)
>>>>
>>>> This won't cause rack collisions, because I specify each rack by name at first.
>>>> But the problem is that an OSD in rack0 will always be the primary OSD, because I choose from rack0 first.
>>>>
>>>> So the question is: what is the recommended way to meet such a need (distribute 11 replicas into 6 racks evenly, in case of rack failure)?
>>>>
>>>> Thanks!
>>>> LeiDong
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users at lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
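
For reference, a decompiled crushmap rule that combines the rack/OSD steps from the original post with the set_chooseleaf_tries hint could look like the sketch below. This is only a sketch under assumptions: the rule name ecpool_6racks, the ruleset id, and the min_size/max_size bounds are placeholders not taken from the thread, the set_choose_tries line is an additional retry knob that is not discussed above, and the retry values would need testing against the actual map.

    rule ecpool_6racks {                     # hypothetical rule name
            ruleset 1                        # placeholder ruleset id
            type erasure
            min_size 11                      # k=8 + m=3 = 11 chunks (placeholder bounds)
            max_size 11
            step set_chooseleaf_tries 10     # Loic's suggestion: retry leaf picks harder
            step set_choose_tries 100        # extra retry knob, not mentioned in the thread
            step take default
            step choose firstn 6 type rack   # the original rule's rack step
            step chooseleaf indep 2 type osd # at most 2 OSDs per rack
            step emit
    }

Whether the bad mappings actually go away can be checked offline with something like: crushtool -i crushmap --test --rule 1 --num-rep 11 --show-bad-mappings, where crushmap is the compiled map and 1 is whatever ruleset id the rule ends up with.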