Re: unbalanced OSDs

Take a look at https://github.com/TheJJ/ceph-balancer

We switched to it after a lot of attempts to make the internal balancer work as expected, and now we have ~even OSD utilization across the cluster:

# ./placementoptimizer.py -v balance --ensure-optimal-moves --ensure-variance-decrease
[2023-08-03 23:33:27,954] gathering cluster state via ceph api...
[2023-08-03 23:33:36,081] running pg balancer
[2023-08-03 23:33:36,088] current OSD fill rate per crushclasses:
[2023-08-03 23:33:36,089]   ssd: average=49.86%, median=50.27%, without_placement_constraints=53.01%
[2023-08-03 23:33:36,090] cluster variance for crushclasses:
[2023-08-03 23:33:36,090]   ssd: 4.163
[2023-08-03 23:33:36,090] min osd.14 44.698%
[2023-08-03 23:33:36,090] max osd.22 51.897%
[2023-08-03 23:33:36,101] in descending full-order, couldn't empty osd.22, so we're done. if you want to try more often, set --max-full-move-attempts=$nr, this may unlock more balancing possibilities.
[2023-08-03 23:33:36,101] --------------------------------------------------------------------------------
[2023-08-03 23:33:36,101] generated 0 remaps.
[2023-08-03 23:33:36,101] total movement size: 0.0B.
[2023-08-03 23:33:36,102] --------------------------------------------------------------------------------
[2023-08-03 23:33:36,102] old cluster variance per crushclass:
[2023-08-03 23:33:36,102]   ssd: 4.163
[2023-08-03 23:33:36,102] old min osd.14 44.698%
[2023-08-03 23:33:36,102] old max osd.22 51.897%
[2023-08-03 23:33:36,102] --------------------------------------------------------------------------------
[2023-08-03 23:33:36,103] new min osd.14 44.698%
[2023-08-03 23:33:36,103] new max osd.22 51.897%
[2023-08-03 23:33:36,103] new cluster variance:
[2023-08-03 23:33:36,103]   ssd: 4.163
[2023-08-03 23:33:36,103] --------------------------------------------------------------------------------
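
In case it helps once it actually generates remaps: the script prints the proposed movements as "ceph osd pg-upmap-items ..." commands on stdout (the log lines above go to stderr, if I remember right), so the usual workflow from the project README is roughly the following (double-check the exact flags for your version; /tmp/balance-upmaps is just an example path):

# ./placementoptimizer.py -v balance --ensure-optimal-moves --ensure-variance-decrease | tee /tmp/balance-upmaps
(review the generated pg-upmap-items commands)
# bash /tmp/balance-upmaps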


On 03.08.2023 16:38, Spiros Papageorgiou wrote:
On 03-Aug-23 12:11 PM, Eugen Block wrote:
ceph balancer status

I changed the PGs and it started rebalancing (and I turned the autoscaler off), so now it will not report a status:

It reports: "optimize_result": "Too many objects (0.088184 > 0.050000) are misplaced; try again later"
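
(If I understand it correctly, the 0.05 in that message is the mgr's target_max_misplaced_ratio, so the balancer only resumes once fewer than 5% of objects are misplaced. It could be raised with something like "ceph config set mgr target_max_misplaced_ratio 0.07", but simply waiting should also work.)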

Let's wait a few hours to see what happens...

Thanx!

Sp

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




