RE: question about the 'r' value in CRUSH

Okay, thank you for the clarification! :D

- - - - - - - - - - - - - - - - - - - -
Sangdi Xu
UIS 2, Team BORE

> -----Original Message-----
> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> Sent: Monday, January 25, 2016 9:25 PM
> To: xusangdi 11976 (RD)
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: question about the 'r' value in CRUSH
> 
> On Mon, 25 Jan 2016, Xusangdi wrote:
> > In a normal scenario this behavior is less pronounced, but it does exist. Please see below:
> >
> > $ crushtool -o crushmap --build --num_osds 12 host straw2 3 root straw2 0
> > $ crushtool -i crushmap --test --x 0 --num-rep 3 --show-mappings --weight 0 0
> [...]
> 
> The retries are normal.  You just need to leave choose_tries at a large enough value (the default is
> now 50, I believe) so that you always get a full-sized result.  It is a problem, though, if you have a
> very small set of potentially suitable results (as with your 3-item tree with skewed weights, or perhaps
> a larger tree with most of the OSDs marked 'out').
> 
> sage
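
The effect of the tries budget can be estimated with a rough sketch. It assumes each retry behaves like an independent weighted draw, which idealizes the hashed straw2 selection, and it ignores rejects caused by OSDs that are marked out. The function name `p_slot_unfilled` and the list-index layout are illustrative, not part of CRUSH.

```python
from fractions import Fraction

def p_slot_unfilled(weights, taken, tries):
    """Probability that `tries` independent weighted draws all collide.

    `weights` lists the bucket weights; `taken` lists the indexes of items
    already used by earlier replicas. Assumption: draws are independent.
    """
    total = sum(weights)
    p_collide = Fraction(sum(weights[i] for i in taken), total)
    return p_collide ** tries

# Weights from the skewed 3-host example; hosts 1 and 2 are already taken,
# so only the remaining weight-3 host can fill the third slot.
weights = [3, 3, 30]
for tries in (19, 50, 100):
    p = p_slot_unfilled(weights, taken=[1, 2], tries=tries)
    print(f"tries={tries:3d}  P(third slot stays empty) ~ {float(p):.4f}")
```

Under this model a budget of 50 tries still leaves the third slot empty in roughly one mapping out of a hundred for such a tree, which matches the point that a large enough tries value helps but cannot fully compensate for a very small candidate set.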
> 
> 
> 
> > CRUSHCHOOSE_LEAF bucket -5 x 0 outpos 0 numrep 3 tries 51 recurse_tries 1 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -5 x=0 r=0
> >   item -1 type 1
> > CHOOSE bucket -1 x 0 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -1 x=0 r=0
> >   item 0 type 0
> >   reject 1  collide 0  ftotal 1  flocal 1
> > skip rep
> > CHOOSE returns 0
> >   reject 1  collide 0  ftotal 1  flocal 1
> >  crush_bucket_choose -5 x=0 r=1
> >   item -4 type 1
> > CHOOSE bucket -4 x 0 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -4 x=0 r=0
> >   item 11 type 0
> > CHOOSE got 11
> > CHOOSE returns 1
> > CHOOSE got -4
> >  crush_bucket_choose -5 x=0 r=1 <== redundant try
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 1  flocal 1
> >  crush_bucket_choose -5 x=0 r=2
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 2  flocal 1
> >  crush_bucket_choose -5 x=0 r=3
> >   item -2 type 1
> > CHOOSE bucket -2 x 0 outpos 1 numrep 2 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -2 x=0 r=1
> >   item 5 type 0
> > CHOOSE got 5
> > CHOOSE returns 2
> > CHOOSE got -2
> >  crush_bucket_choose -5 x=0 r=2 <== redundant try
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 1  flocal 1
> >  crush_bucket_choose -5 x=0 r=3 <== redundant try
> >   item -2 type 1
> >   reject 0  collide 1  ftotal 2  flocal 1
> >  crush_bucket_choose -5 x=0 r=4
> >   item -2 type 1
> >   reject 0  collide 1  ftotal 3  flocal 1
> >  crush_bucket_choose -5 x=0 r=5
> >   item -1 type 1
> > CHOOSE bucket -1 x 0 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -1 x=0 r=2
> >   item 0 type 0
> >   reject 1  collide 0  ftotal 1  flocal 1
> > skip rep
> > CHOOSE returns 2
> >   reject 1  collide 0  ftotal 4  flocal 1
> >  crush_bucket_choose -5 x=0 r=6
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 5  flocal 1
> >  crush_bucket_choose -5 x=0 r=7
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 6  flocal 1
> >  crush_bucket_choose -5 x=0 r=8
> >   item -2 type 1
> >   reject 0  collide 1  ftotal 7  flocal 1
> >  crush_bucket_choose -5 x=0 r=9
> >   item -4 type 1
> >   reject 0  collide 1  ftotal 8  flocal 1
> >  crush_bucket_choose -5 x=0 r=10
> >   item -3 type 1
> > CHOOSE bucket -3 x 0 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> >  crush_bucket_choose -3 x=0 r=2
> >   item 7 type 0
> > CHOOSE got 7
> > CHOOSE returns 3
> > CHOOSE got -3
> > CHOOSE returns 3
> >  rule 0 x 0 [11,5,7]
> >
> > - - - - - - - - - - - - - - - - - - - -
> > Sangdi Xu
> > UIS 2, Team BORE
> >
> > > -----Original Message-----
> > > From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> > > Sent: Sunday, January 24, 2016 10:35 PM
> > > To: xusangdi 11976 (RD)
> > > Cc: ceph-devel@xxxxxxxxxxxxxxx
> > > Subject: Re: question about the 'r' value in CRUSH
> > >
> > > On Sat, 23 Jan 2016, Xusangdi wrote:
> > > > Hi Sage,
> > > >
> > > > We recently encountered an interesting case while learning about CRUSH; please see below:
> > > >
> > > > root root {
> > > >     id -4       # do not change unnecessarily
> > > >     # weight 36.000
> > > >     alg straw2
> > > >     hash 0  # rjenkins1
> > > >     item host0 weight 3.000
> > > >     item host1 weight 3.000
> > > >     item host2 weight 30.000
> > > > }
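
To show how the weights in this bucket translate into selection odds, here is a minimal sketch of a straw2-style draw. Each item draws ln(u) / weight and the largest draw wins, so an item is picked with probability proportional to its weight. It is an illustration under assumptions rather than Ceph's implementation: SHA-256 stands in for the rjenkins1 hash, floating point replaces CRUSH's fixed-point logarithm, and the name `straw2_choose` is made up for this example.

```python
import hashlib
import math

def straw2_choose(items, x, r):
    """Pick one item id from {item_id: weight} for input x and attempt r.

    Stand-in for straw2: every item draws ln(u) / weight with u in (0, 1],
    and the largest draw wins. SHA-256 replaces CRUSH's rjenkins1 hash.
    """
    best_id, best_draw = None, -math.inf
    for item_id, weight in items.items():
        if weight <= 0:
            continue  # zero-weight items can never win
        digest = hashlib.sha256(f"{x}:{r}:{item_id}".encode()).digest()
        u = (int.from_bytes(digest[:8], "big") + 1) / 2**64  # in (0, 1]
        draw = math.log(u) / weight
        if draw > best_draw:
            best_id, best_draw = item_id, draw
    return best_id

hosts = {"host0": 3.0, "host1": 3.0, "host2": 30.0}
picks = [straw2_choose(hosts, x, 0) for x in range(3000)]
share = picks.count("host2") / len(picks)
print(f"host2 share of first picks: {share:.3f} (weights predict {30 / 36:.3f})")
```

Because the draw depends only on (x, r, item), repeating a try with the same r always returns the same item, which is why a retry only makes progress once r changes.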
> > > >
> > > > CRUSHCHOOSE_LEAF bucket -4 x 3 outpos 0 numrep 3 tries 51 recurse_tries 1 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > >  crush_bucket_choose -4 x=3 r=0
> > > >   item -3 type 1
> > > > CHOOSE bucket -3 x 3 outpos 0 numrep 1 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > >  crush_bucket_choose -3 x=3 r=0
> > > >   item 8 type 0
> > > > CHOOSE got 8
> > > > CHOOSE returns 1
> > > > CHOOSE got -3
> > > >  crush_bucket_choose -4 x=3 r=1
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 1  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=2
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 2  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=3
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 3  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=4
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 4  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=5
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 5  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=6
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 6  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=7
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 7  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=8
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 8  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=9
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 9  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=10
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 10  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=11
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 11  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=12
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 12  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=13
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 13  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=14
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 14  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=15
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 15  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=16
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 16  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=17
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 17  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=18
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 18  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=19
> > > >   item -2 type 1
> > > > CHOOSE bucket -2 x 3 outpos 1 numrep 2 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > >  crush_bucket_choose -2 x=3 r=1
> > > >   item 4 type 0
> > > > CHOOSE got 4
> > > > CHOOSE returns 2
> > > > CHOOSE got -2
> > > >  crush_bucket_choose -4 x=3 r=2
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 1  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=3
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 2  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=4
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 3  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=5
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 4  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=6
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 5  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=7
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 6  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=8
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 7  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=9
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 8  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=10
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 9  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=11
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 10  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=12
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 11  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=13
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 12  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=14
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 13  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=15
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 14  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=16
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 15  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=17
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 16  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=18
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 17  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=19
> > > >   item -2 type 1
> > > >   reject 0  collide 1  ftotal 18  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=20
> > > >   item -2 type 1
> > > >   reject 0  collide 1  ftotal 19  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=21
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 20  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=22
> > > >   item -3 type 1
> > > >   reject 0  collide 1  ftotal 21  flocal 1
> > > >  crush_bucket_choose -4 x=3 r=23
> > > >   item -1 type 1
> > > > CHOOSE bucket -1 x 3 outpos 2 numrep 3 tries 1 recurse_tries 0 local_retries 0 local_fallback_retries 0 parent_r 0 stable 0
> > > >  crush_bucket_choose -1 x=3 r=2
> > > >   item 0 type 0
> > > > CHOOSE got 0
> > > > CHOOSE returns 3
> > > > CHOOSE got -1
> > > > CHOOSE returns 3
> > > >  rule 0 x 3 [8,4,0]
> > > >
> > > > It looks like, when choosing the third replica, we repeat many of the
> > > > tries that were already made for the second one.
> > > > Is this intended, or is it perhaps an issue?
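
The overlap described here can be reproduced from the trace alone. The sketch below assumes, consistent with the r values printed in the trace, that in this firstn rule each replica slot starts at r = rep + parent_r and moves to the next r after every failed try. The helper name `r_values` is hypothetical and the failure counts are read off the trace.

```python
def r_values(rep, failures, parent_r=0):
    """r values tried for replica slot `rep` (0-based) that fails `failures`
    times before succeeding. Assumes r = rep + parent_r + ftotal, which
    matches the r sequence shown in the trace."""
    return [rep + parent_r + ftotal for ftotal in range(failures + 1)]

second = r_values(rep=1, failures=18)  # r = 1 .. 19 in the trace
third = r_values(rep=2, failures=21)   # r = 2 .. 23 in the trace

# r values that the third replica tries again after the second already did
overlap = sorted(set(second) & set(third))
print(f"second replica tried r={second[0]}..{second[-1]}")
print(f"third replica tried  r={third[0]}..{third[-1]}")
print(f"{len(overlap)} repeated r values: {overlap[0]}..{overlap[-1]}")
```

Since the bucket choice for a given x and r is deterministic, every repeated r returns the same item as before, so under this assumption those tries cannot produce a new result for the third replica.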
> > >
> > > It's because one bucket is weighted so much more heavily than the others.
> > > You're asking for something that's somewhat impossible: 3 replicas
> > > coming from 3 items with different weights.  Since we're sampling,
> > > if the weights are too skewed you'll run out of tries (r values) and it'll give up with only 2 results.
> > >
> > > sage
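
As a rough worked example of why skewed weights exhaust the tries, the sketch below computes the expected number of draws needed to see all items at least once. It assumes every try is an independent weighted draw (a coupon-collector model), which simplifies CRUSH's hashed, per-replica retry accounting. The function name `expected_draws` is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def expected_draws(weights):
    """Expected number of independent weighted draws until every item has
    been drawn at least once (inclusion-exclusion over item subsets).
    Weights must be integers so the result stays exact."""
    total = sum(weights)
    probs = [Fraction(w, total) for w in weights]
    expected = Fraction(0)
    for size in range(1, len(probs) + 1):
        sign = 1 if size % 2 else -1
        for subset in combinations(probs, size):
            expected += sign / sum(subset)
    return expected

skewed = expected_draws([3, 3, 30])      # weights from the bucket above
balanced = expected_draws([12, 12, 12])  # same total weight, spread evenly
print(f"skewed:   {float(skewed):.2f} draws on average")
print(f"balanced: {float(balanced):.2f} draws on average")
```

In this model the skewed tree needs about 18 draws on average to cover all three hosts, compared with 5.5 for equal weights, so an unlucky input can plausibly use up a budget of 50 tries.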



