Okay, thank you for the clarification! :D

- - - - - - - - - - - - - - - - - - - -
Sangdi Xu
UIS 2, Team BORE

> -----Original Message-----
> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> Sent: Monday, January 25, 2016 9:25 PM
> To: xusangdi 11976 (RD)
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: question about the 'r' value in CRUSH
>
> On Mon, 25 Jan 2016, Xusangdi wrote:
> > In a normal scenario this behavior is not that pronounced, but it does exist. Please see below:
> >
> > $ crushtool -o crushmap --build --num_osds 12 host straw2 3 root straw2 0
> > $ crushtool -i crushmap --test --x 0 --num-rep 3 --show-mappings --weight 0 0
> [...]
>
> The retries are normal. You just need to leave choose_tries at a large enough value (default is now 50,
> I believe) so that you always get a full-sized result. It is a problem if you have a very small set of
> potentially suitable results, though (as with your 3-item tree with skewed weights, or perhaps a larger
> tree but with most of the OSDs marked 'out').
>
> sage
>
> >
> > CRUSHCHOOSE_LEAF bucket -5 x 0 outpos 0 numrep 3 tries 51 recurse_tries 1 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -5 x=0 r=0
> > item -1 type 1
> > CHOOSE bucket -1 x 0 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -1 x=0 r=0
> > item 0 type 0
> > reject 1 collide 0 ftotal 1 flocal 1
> > skip rep
> > CHOOSE returns 0
> > reject 1 collide 0 ftotal 1 flocal 1
> > crush_bucket_choose -5 x=0 r=1
> > item -4 type 1
> > CHOOSE bucket -4 x 0 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -4 x=0 r=0
> > item 11 type 0
> > CHOOSE got 11
> > CHOOSE returns 1
> > CHOOSE got -4
> > crush_bucket_choose -5 x=0 r=1 <== redundant try
> > item -4 type 1
> > reject 0 collide 1 ftotal 1 flocal 1
> > crush_bucket_choose -5 x=0 r=2
> > item -4 type 1
> > reject 0 collide 1 ftotal 2 flocal 1
> > crush_bucket_choose -5 x=0 r=3
> > item -2 type 1
> > CHOOSE bucket -2 x 0 outpos 1 numrep 2 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -2 x=0 r=1
> > item 5 type 0
> > CHOOSE got 5
> > CHOOSE returns 2
> > CHOOSE got -2
> > crush_bucket_choose -5 x=0 r=2 <== redundant try
> > item -4 type 1
> > reject 0 collide 1 ftotal 1 flocal 1
> > crush_bucket_choose -5 x=0 r=3 <== redundant try
> > item -2 type 1
> > reject 0 collide 1 ftotal 2 flocal 1
> > crush_bucket_choose -5 x=0 r=4
> > item -2 type 1
> > reject 0 collide 1 ftotal 3 flocal 1
> > crush_bucket_choose -5 x=0 r=5
> > item -1 type 1
> > CHOOSE bucket -1 x 0 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -1 x=0 r=2
> > item 0 type 0
> > reject 1 collide 0 ftotal 1 flocal 1
> > skip rep
> > CHOOSE returns 2
> > reject 1 collide 0 ftotal 4 flocal 1
> > crush_bucket_choose -5 x=0 r=6
> > item -4 type 1
> > reject 0 collide 1 ftotal 5 flocal 1
> > crush_bucket_choose -5 x=0 r=7
> > item -4 type 1
> > reject 0 collide 1 ftotal 6 flocal 1
> > crush_bucket_choose -5 x=0 r=8
> > item -2 type 1
> > reject 0 collide 1 ftotal 7 flocal 1
> > crush_bucket_choose -5 x=0 r=9
> > item -4 type 1
> > reject 0 collide 1 ftotal 8 flocal 1
> > crush_bucket_choose -5 x=0 r=10
> > item -3 type 1
> > CHOOSE bucket -3 x 0 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > crush_bucket_choose -3 x=0 r=2
> > item 7 type 0
> > CHOOSE got 7
> > CHOOSE returns 3
> > CHOOSE got -3
> > CHOOSE returns 3
> > rule 0 x 0 [11,5,7]
> >
> > - - - - - - - - - - - - - - - - - - - -
> > Sangdi Xu
> > UIS 2, Team BORE
> >
> > > -----Original Message-----
> > > From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> > > Sent: Sunday, January 24, 2016 10:35 PM
> > > To: xusangdi 11976 (RD)
> > > Cc: ceph-devel@xxxxxxxxxxxxxxx
> > > Subject: Re: question about the 'r' value in CRUSH
> > >
> > > On Sat, 23 Jan 2016, Xusangdi wrote:
> > > > Hi Sage,
> > > >
> > > > Recently we encountered an interesting case when learning about CRUSH, please see below:
> > > >
> > > > root root {
> > > >         id -4           # do not change unnecessarily
> > > >         # weight 36.000
> > > >         alg straw2
> > > >         hash 0  # rjenkins1
> > > >         item host0 weight 3.000
> > > >         item host1 weight 3.000
> > > >         item host2 weight 30.000
> > > > }
> > > >
> > > > CRUSHCHOOSE_LEAF bucket -4 x 3 outpos 0 numrep 3 tries 51 recurse_tries 1 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > > crush_bucket_choose -4 x=3 r=0
> > > > item -3 type 1
> > > > CHOOSE bucket -3 x 3 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > > crush_bucket_choose -3 x=3 r=0
> > > > item 8 type 0
> > > > CHOOSE got 8
> > > > CHOOSE returns 1
> > > > CHOOSE got -3
> > > > crush_bucket_choose -4 x=3 r=1
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 1 flocal 1
> > > > crush_bucket_choose -4 x=3 r=2
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 2 flocal 1
> > > > crush_bucket_choose -4 x=3 r=3
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 3 flocal 1
> > > > crush_bucket_choose -4 x=3 r=4
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 4 flocal 1
> > > > crush_bucket_choose -4 x=3 r=5
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 5 flocal 1
> > > > crush_bucket_choose -4 x=3 r=6
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 6 flocal 1
> > > > crush_bucket_choose -4 x=3 r=7
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 7 flocal 1
> > > > crush_bucket_choose -4 x=3 r=8
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 8 flocal 1
> > > > crush_bucket_choose -4 x=3 r=9
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 9 flocal 1
> > > > crush_bucket_choose -4 x=3 r=10
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 10 flocal 1
> > > > crush_bucket_choose -4 x=3 r=11
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 11 flocal 1
> > > > crush_bucket_choose -4 x=3 r=12
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 12 flocal 1
> > > > crush_bucket_choose -4 x=3 r=13
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 13 flocal 1
> > > > crush_bucket_choose -4 x=3 r=14
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 14 flocal 1
> > > > crush_bucket_choose -4 x=3 r=15
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 15 flocal 1
> > > > crush_bucket_choose -4 x=3 r=16
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 16 flocal 1
> > > > crush_bucket_choose -4 x=3 r=17
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 17 flocal 1
> > > > crush_bucket_choose -4 x=3 r=18
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 18 flocal 1
> > > > crush_bucket_choose -4 x=3 r=19
> > > > item -2 type 1
> > > > CHOOSE bucket -2 x 3 outpos 1 numrep 2 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > > crush_bucket_choose -2 x=3 r=1
> > > > item 4 type 0
> > > > CHOOSE got 4
> > > > CHOOSE returns 2
> > > > CHOOSE got -2
> > > > crush_bucket_choose -4 x=3 r=2
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 1 flocal 1
> > > > crush_bucket_choose -4 x=3 r=3
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 2 flocal 1
> > > > crush_bucket_choose -4 x=3 r=4
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 3 flocal 1
> > > > crush_bucket_choose -4 x=3 r=5
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 4 flocal 1
> > > > crush_bucket_choose -4 x=3 r=6
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 5 flocal 1
> > > > crush_bucket_choose -4 x=3 r=7
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 6 flocal 1
> > > > crush_bucket_choose -4 x=3 r=8
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 7 flocal 1
> > > > crush_bucket_choose -4 x=3 r=9
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 8 flocal 1
> > > > crush_bucket_choose -4 x=3 r=10
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 9 flocal 1
> > > > crush_bucket_choose -4 x=3 r=11
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 10 flocal 1
> > > > crush_bucket_choose -4 x=3 r=12
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 11 flocal 1
> > > > crush_bucket_choose -4 x=3 r=13
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 12 flocal 1
> > > > crush_bucket_choose -4 x=3 r=14
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 13 flocal 1
> > > > crush_bucket_choose -4 x=3 r=15
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 14 flocal 1
> > > > crush_bucket_choose -4 x=3 r=16
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 15 flocal 1
> > > > crush_bucket_choose -4 x=3 r=17
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 16 flocal 1
> > > > crush_bucket_choose -4 x=3 r=18
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 17 flocal 1
> > > > crush_bucket_choose -4 x=3 r=19
> > > > item -2 type 1
> > > > reject 0 collide 1 ftotal 18 flocal 1
> > > > crush_bucket_choose -4 x=3 r=20
> > > > item -2 type 1
> > > > reject 0 collide 1 ftotal 19 flocal 1
> > > > crush_bucket_choose -4 x=3 r=21
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 20 flocal 1
> > > > crush_bucket_choose -4 x=3 r=22
> > > > item -3 type 1
> > > > reject 0 collide 1 ftotal 21 flocal 1
> > > > crush_bucket_choose -4 x=3 r=23
> > > > item -1 type 1
> > > > CHOOSE bucket -1 x 3 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > > crush_bucket_choose -1 x=3 r=2
> > > > item 0 type 0
> > > > CHOOSE got 0
> > > > CHOOSE returns 3
> > > > CHOOSE got -1
> > > > CHOOSE returns 3
> > > > rule 0 x 3 [8,4,0]
> > > >
> > > > It looks like when choosing the third replica, we repeat a lot of
> > > > tries that were already made for the second one.
> > > > Is this intended, or is it maybe an issue?
> > >
> > > It's because one bucket is weighted so much more heavily than the others.
> > > You're asking for something that's somewhat impossible: 3 replicas
> > > coming from 3 items with different weights. Since we're sampling,
> > > if the weights are too skewed you'll run out of tries (r values) and it'll give up with only 2 results.
> > >
> > > sage
> > ----------------------------------------------------------------------
> > This e-mail and its attachments contain confidential information from
> > H3C, which is intended only for the person or entity whose address is
> > listed above. Any use of the information contained herein in any way
> > (including, but not limited to, total or partial disclosure,
> > reproduction, or dissemination) by persons other than the intended
> > recipient(s) is prohibited. If you receive this e-mail in error,
> > please notify the sender by phone or email immediately and delete it!
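
For anyone reading this thread later: the traces suggest that each replica restarts at r = rep and then advances as r = rep + ftotal, which would explain why replica N+1 revisits r values that replica N already tried. Below is a minimal Python sketch of that retry pattern, not the real mapper code. The names straw2_like_choose and choose_firstn are made up for illustration, SHA-256 stands in for rjenkins1 (so the picks will not match crushtool output), and only collisions are modelled (no 'out' OSDs, no leaf recursion).

```python
import hashlib
import math


def straw2_like_choose(weights, x, r):
    """Pick one item for input x and attempt r (hypothetical stand-in for
    crush_bucket_choose on a straw2 bucket).

    Each item draws ln(u) / weight with u in (0, 1]; the largest draw wins,
    so heavier items win more often. Weights are assumed to be positive.
    """
    best_item, best_draw = None, -math.inf
    for item, weight in weights.items():
        digest = hashlib.sha256(f"{x}:{item}:{r}".encode()).digest()
        # Map the first 8 bytes of the digest to u in (0, 1].
        u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
        draw = math.log(u) / weight
        if draw > best_draw:
            best_item, best_draw = item, draw
    return best_item


def choose_firstn(weights, x, numrep, tries=51):
    """Select up to numrep distinct items; return (chosen, trace).

    trace holds one (rep, r, item, collide) tuple per attempt.
    """
    chosen, trace = [], []
    for rep in range(numrep):
        for ftotal in range(tries):
            r = rep + ftotal  # assumption taken from the traces in this thread
            item = straw2_like_choose(weights, x, r)
            collide = item in chosen
            trace.append((rep, r, item, collide))
            if not collide:
                chosen.append(item)
                break
        # If every try collided, this replica is skipped and the result is short.
    return chosen, trace


if __name__ == "__main__":
    weights = {"host0": 3.0, "host1": 3.0, "host2": 30.0}
    chosen, trace = choose_firstn(weights, x=3, numrep=3)
    print("chosen:", chosen)
    for rep, r, item, collide in trace:
        print(f"rep={rep} r={r} item={item} collide={int(collide)}")
```

Because the pick depends only on (x, item, r), any r that collided for one replica returns the same item when the next replica reaches it, which is the "redundant try" pattern marked in the trace above. Running the sketch with equal weights instead of 3/3/30 should show far fewer collisions per replica.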