Re: When Zero isn't 0 (Crush weight mysteries)

Hello,

On Wed, 21 Dec 2016 11:33:48 +0100 (CET) Wido den Hollander wrote:

> 
> > On 21 December 2016 at 2:39, Christian Balzer <chibi@xxxxxxx> wrote:
> > 
> > 
> > 
> > Hello,
> > 
> > I just (manually) added 1 OSD each to my 2 cache-tier nodes. 
> > The plan was/is to actually do the data migration on the least busy day
> > in Japan, New Year's (the actual holiday is January 2nd this year).
> > 
> > So I was going to have everything up and in but at weight 0 initially.
> > 
> > Alas at the "ceph osd crush add osd.x0 0 host=ceph-0x" steps Ceph happily
> > started to juggle a few PGs (about 7 total) around, despite of course no
> > weight in the cluster changing at all.
> > No harm done (this is the fast and not too busy cache-tier after all), but
> > very much unexpected.
> > 
> > So which part of the CRUSH algorithm goes around and pulls weights out of
> > thin air?
> > 
> 
> It didn't, but the CRUSH topology changed. A CRUSH dev might have a better and more detailed explanation, but although the item has a weight of 0 it is still an item to straw(2).
> 
> When drawing straws it never gets selected because of the weight of 0, but it is still there.
> 
Yes, that makes sense.
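
To make the "it is still there but never gets selected" point concrete, here is a minimal sketch of a straw2-style draw. It is a simplification, not Ceph's code: SHA-256 stands in for the hash CRUSH really uses, only one OSD is picked per PG, and the OSD names and weights are made up for illustration.

```python
import hashlib
import math

def draw(pg_id, name, weight):
    """Straw length for one item: ln(u) / weight, with u uniform in (0, 1]."""
    if weight <= 0:
        return -math.inf  # a zero-weight item can never hold the longest straw
    digest = hashlib.sha256(f"{pg_id}:{name}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight

def choose(pg_id, items):
    """Return the item with the longest straw; items maps name -> weight."""
    return max(items, key=lambda name: draw(pg_id, name, items[name]))

# Hypothetical bucket before and after adding a new OSD at weight 0.
before = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0}
after = {**before, "osd.3": 0.0}

pgs = range(1024)
moved = sum(choose(pg, before) != choose(pg, after) for pg in pgs)
picked_new = sum(choose(pg, after) == "osd.3" for pg in pgs)
print(moved, picked_new)  # prints "0 0"
```

In this idealized model each straw depends only on the PG, the item and the item's own weight, so the zero-weight OSD is never chosen and no placement changes. The few PGs that moved on the real cluster would therefore have to come from something the sketch leaves out, for example buckets of the older straw type rather than straw2.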

> Same goes when you set the weight of the OSD to 0 and remove it from CRUSH a few days later. That means that you get two rebalances.
> 
Only a minor rebalance the second time around, though, unlike when setting the weight to 0.

> In your case it would be best to add the items to CRUSH with the right weight when you want them to start participating.
> 
In that particular case (just 2 OSDs) I could have done that, but you'll
find that people would like to prepare things in advance (sometimes over
prolonged periods of time when adding nodes) and then have everything go
wild at once, so as to move data around only once.

This way the OSD daemons are also already running and have done their
initial peering, reducing the impact on the cluster.
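
The "move data only once" argument can be illustrated with the same kind of simplified straw2-style model. Again, this is a sketch under assumptions: SHA-256 replaces the real CRUSH hash, one OSD is chosen per PG, and all OSD names, weights and the PG count are hypothetical.

```python
import hashlib
import math

def draw(pg_id, name, weight):
    """Straw length for one item: ln(u) / weight, with u uniform in (0, 1]."""
    if weight <= 0:
        return -math.inf  # zero weight never wins
    digest = hashlib.sha256(f"{pg_id}:{name}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight

def choose(pg_id, items):
    """Return the item with the longest straw; items maps name -> weight."""
    return max(items, key=lambda name: draw(pg_id, name, items[name]))

def count_moves(pgs, steps):
    """Sum, over consecutive weight maps, the PGs whose chosen OSD changes."""
    total = 0
    for old, new in zip(steps, steps[1:]):
        total += sum(choose(pg, old) != choose(pg, new) for pg in pgs)
    return total

pgs = range(4096)
# Two new OSDs are already "up and in" but still at weight 0.
base = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0, "osd.3": 1.0,
        "osd.10": 0.0, "osd.11": 0.0}
one_up = {**base, "osd.10": 1.0}
both_up = {**one_up, "osd.11": 1.0}

staged = count_moves(pgs, [base, one_up, both_up])  # reweight one at a time
at_once = count_moves(pgs, [base, both_up])         # reweight both together
print("staged:", staged, "at once:", at_once)
```

In this model the staged count can never be lower than the at-once count: any PG that first lands on osd.10 and then loses to osd.11 is moved twice when the weights are raised one after the other, but only once when both are raised together.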

Christian

> Wido
> 
> > Christian
> > -- 
> > Christian Balzer        Network/Systems Engineer                
> > chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/


