Re: Re-weight Entire Cluster?

"Anthony D'Atri" <aad@xxxxxxxxxxxxxx> · Tue, 30 May 2017 13:20:13 -0700

OIC, thanks for providing the tree output.  From what you wrote originally it seemed plausible that you were mixing up the columns, which is not an uncommon thing to do.

If all of your OSDs are the same size and all have a CRUSH weight of 1.0000, then you have just the usual OSD fullness distribution problem.

If you have other OSDs in the cluster that are the same size as these but have different CRUSH weights, then you do have a problem.  Is that the case?  Feel free to privately email me your entire ceph osd tree output if you like, to avoid spamming the list.
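
To make the "relative, not absolute" point concrete, here is a rough Python sketch. It is not Ceph code: expected_shares is a made-up helper, and it assumes the simplification that each OSD's expected share of data is just its CRUSH weight divided by the sum of its siblings' weights (real CRUSH placement is pseudo-random, so actual fullness will scatter around these numbers). The OSD names come from the tree output quoted below; the "mixed" case is hypothetical.

```python
import math


def expected_shares(crush_weights):
    """Expected fraction of data per OSD, assuming share is proportional to CRUSH weight."""
    total = sum(crush_weights.values())
    return {osd: weight / total for osd, weight in crush_weights.items()}


# Eight same-size OSDs, all weighted 1.0 (as in the quoted tree output).
all_ones = {f"osd.{i}": 1.0 for i in range(216, 224)}

# The same OSDs with every weight doubled, and with a size-based weight instead.
doubled = {osd: weight * 2 for osd, weight in all_ones.items()}
size_based = {osd: 3.48169 for osd in all_ones}

shares_ones = expected_shares(all_ones)
shares_doubled = expected_shares(doubled)
shares_size = expected_shares(size_based)

# All three weightings give every OSD the same expected share.
uniform_is_equivalent = all(
    math.isclose(shares_ones[osd], shares_doubled[osd])
    and math.isclose(shares_ones[osd], shares_size[osd])
    for osd in all_ones
)

# Hypothetical problem case: same-size OSDs carrying different CRUSH weights.
mixed = {"osd.216": 1.0, "osd.936": 3.48169}
shares_mixed = expected_shares(mixed)

print(uniform_is_equivalent)
print(shares_mixed)
```

Under that assumption the uniform cases are interchangeable, while in the mixed case the OSD weighted 3.48169 is expected to receive about three and a half times as much data as its same-size sibling weighted 1.0, which is the situation that would actually need fixing.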

— aad

> Hi Anthony,
> 
> When the OSDs were added it appears they were added with a crush weight of 1 so I believe we need to change the weighting as we are getting a lot of very full OSDs.
> 
> -21  20.00000         host somehost
> 216   1.00000             osd.216           up  1.00000          1.00000
> 217   1.00000             osd.217           up  1.00000          1.00000
> 218   1.00000             osd.218           up  1.00000          1.00000
> 219   1.00000             osd.219           up  1.00000          1.00000
> 220   1.00000             osd.220           up  1.00000          1.00000
> 221   1.00000             osd.221           up  1.00000          1.00000
> 222   1.00000             osd.222           up  1.00000          1.00000
> 223   1.00000             osd.223           up  1.00000          1.00000
> 
> -----Original Message-----
> From: Anthony D'Atri <aad@xxxxxxxxxxxxxx>
> Date: Tuesday, May 30, 2017 at 1:10 PM
> To: ceph-users <ceph-users@xxxxxxxxxxxxxx>
> Cc: Cave Mike <mcave@xxxxxxx>
> Subject: Re:  Re-weight Entire Cluster?
> 
> 
> 
>> It appears the current best practice is to weight each OSD according to its size (3.64 for 4TB drive, 7.45 for 8TB drive, etc).
> 
> OSDs are created with those sorts of CRUSH weights by default, yes.  Which is convenient, but it’s important to know that those weights are arbitrary, and what really matters is how the weight of each OSD / host / rack compares to those of its siblings.  They are relative weights, not absolute capacities.
> 
>> As it turns out, it was not configured this way at all; all of the OSDs are weighted at 1.
> 
> Are you perhaps confusing CRUSH weights with override weights?  In the below example each OSD has a CRUSH weight of 3.48169, but the override reweight is 1.000.  The override ranges from 0 to 1.  It is admittedly confusing to have two different things called weight.  Ceph’s reweight-by-utilization, for example, acts by adjusting the override reweight and does not touch the CRUSH weights.
> 
> ID  WEIGHT     TYPE NAME                        UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -44   83.56055 host somehostname
> 936    3.48169     osd.936                           up  1.00000          1.00000
> 937    3.48169     osd.937                           up  1.00000          1.00000
> 938    3.48169     osd.938                           up  1.00000          1.00000
> 939    3.48169     osd.939                           up  1.00000          1.00000
> 940    3.48169     osd.940                           up  1.00000          1.00000
> 941    3.48169     osd.941                           up  1.00000          1.00000
> 
> If you see something similar from “ceph osd tree”, then chances are that there’s no point in changing anything, since with CRUSH weights all that matters is how they compare across OSDs, hosts, racks, etc.  So you could double all of them just for grins, and nothing in how the cluster operates would change.
> 
> — Anthony
> 
> 
> 
