Crushmap ruleset for rack-aware PG placement

Hi Daniel,

I see the core dump now, thank you. http://tracker.ceph.com/issues/9490

Cheers

On 16/09/2014 18:39, Daniel Swarbrick wrote:
> Hi Loic,
> 
> Thanks for providing a detailed example. I was able to run the example
> that you provided, and also got my own live crushmap to produce some
> results, once I appended the "--num-rep 3" option to the command.
> Without that option, even your example throws segfaults - maybe a
> bug in crushtool?
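> 
> For reference, the difference boils down to roughly this (your test
> command from below, with and without --num-rep):
> 
> # segfaults for me:
> crushtool -i crushmap --test --show-utilization --rule 1 --min-x 1 --max-x 10
> # works:
> crushtool -i crushmap --test --show-utilization --rule 1 --min-x 1 --max-x 10 --num-rep 3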
> 
> One other area I wasn't sure about: can the final "chooseleaf" step
> specify "firstn 0" for simplicity's sake (and to automatically handle a
> larger pool size in future)? Would there be any downside to this?
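> 
> In other words, something along these lines (untested, just your example
> rule with "firstn 0" substituted in the final step):
> 
> rule myrule {
> 	ruleset 1
> 	type replicated
> 	min_size 1
> 	max_size 10
> 	step take default
> 	step choose firstn 2 type rack
> 	step chooseleaf firstn 0 type host
> 	step emit
> }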
> 
> Cheers
> 
> On 16/09/14 16:20, Loic Dachary wrote:
>> Hi Daniel,
>>
>> When I run
>>
>> crushtool --outfn crushmap --build --num_osds 100 host straw 2 rack straw 10 default straw 0
>> crushtool -d crushmap -o crushmap.txt
>> cat >> crushmap.txt <<EOF
>> rule myrule {
>> 	ruleset 1
>> 	type replicated
>> 	min_size 1
>> 	max_size 10
>> 	step take default
>> 	step choose firstn 2 type rack
>> 	step chooseleaf firstn 2 type host
>> 	step emit
>> }
>> EOF
>> crushtool -c crushmap.txt -o crushmap
>> crushtool -i crushmap --test --show-utilization --rule 1 --min-x 1 --max-x 10 --num-rep 3
>>
>> I get
>>
>> rule 1 (myrule), x = 1..10, numrep = 3..3
>> CRUSH rule 1 x 1 [79,69,10]
>> CRUSH rule 1 x 2 [56,58,60]
>> CRUSH rule 1 x 3 [30,26,19]
>> CRUSH rule 1 x 4 [14,8,69]
>> CRUSH rule 1 x 5 [7,4,88]
>> CRUSH rule 1 x 6 [54,52,37]
>> CRUSH rule 1 x 7 [69,67,19]
>> CRUSH rule 1 x 8 [51,46,83]
>> CRUSH rule 1 x 9 [55,56,35]
>> CRUSH rule 1 x 10 [54,51,95]
>> rule 1 (myrule) num_rep 3 result size == 3:	10/10
>>
>> What command are you running to get a core dump ?
>>
>> Cheers
>>
>> On 16/09/2014 12:02, Daniel Swarbrick wrote:
>>> On 15/09/14 17:28, Sage Weil wrote:
>>>> rule myrule {
>>>> 	ruleset 1
>>>> 	type replicated
>>>> 	min_size 1
>>>> 	max_size 10
>>>> 	step take default
>>>> 	step choose firstn 2 type rack
>>>> 	step chooseleaf firstn 2 type host
>>>> 	step emit
>>>> }
>>>>
>>>> That will give you 4 osds, spread across 2 hosts in each rack.  The pool 
>>>> size (replication factor) is 3, so RADOS will just use the first three (2 
>>>> hosts in first rack, 1 host in second rack).
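>>>
>>> (Presumably that could be sanity-checked by bumping the replica count in
>>> the crushtool test, e.g. something along the lines of:
>>>
>>> crushtool -i crushmap --test --show-utilization --rule 1 --min-x 1 --max-x 10 --num-rep 4
>>>
>>> which should then report four OSDs per input value instead of three.)
>>>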
>>> I have a similar requirement, where we currently have four nodes, two in
>>> each fire zone, with pool size 3. At the moment, due to the number of
>>> nodes, we are guaranteed at least one replica in each fire zone (which
>>> we represent with bucket type "room"). If we add more nodes in future,
>>> the current ruleset may cause all three replicas of a PG to land in a
>>> single zone.
>>>
>>> I tried the ruleset suggested above (replacing "rack" with "room"), but
>>> when testing it with crushtool --test --show-utilization, I simply get
>>> segfaults. No amount of fiddling around seems to make it work - even
>>> adding two new hypothetical nodes to the crushmap doesn't help.
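>>>
>>> For reference, the rule I tested is literally the one above with the
>>> bucket type swapped, i.e. roughly:
>>>
>>> rule myrule {
>>> 	ruleset 1
>>> 	type replicated
>>> 	min_size 1
>>> 	max_size 10
>>> 	step take default
>>> 	step choose firstn 2 type room
>>> 	step chooseleaf firstn 2 type host
>>> 	step emit
>>> }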
>>>
>>> What could I perhaps be doing wrong?
>>>

-- 
Loïc Dachary, Artisan Logiciel Libre
