I think the mimic balancer doesn't include omap data when trying to
balance the cluster. (Because it doesn't get usable omap stats from the
cluster anyway; in Nautilus I think it does.) Are you using RGW or
CephFS?
-Greg

On Wed, Jun 5, 2019 at 1:01 PM Josh Haft <paccrap@xxxxxxxxx> wrote:
>
> Hi everyone,
>
> On my 13.2.5 cluster, I recently enabled the ceph balancer module in
> crush-compat mode. A couple manual 'eval' and 'execute' runs showed
> the score improving, so I set the following and enabled the auto
> balancer.
>
> mgr/balancer/crush_compat_metrics:bytes # from https://github.com/ceph/ceph/pull/20665
> mgr/balancer/max_misplaced:0.01
> mgr/balancer/mode:crush-compat
>
> Log messages from the mgr showed lower scores with each iteration, so
> I thought things were moving in the right direction.
>
> Initially my highest-utilized OSD was at 79% and MAXVAR was 1.17. I
> let the balancer do its thing for 5 days, at which point my highest
> utilized OSD was just over 90% and MAXVAR was about 1.28.
>
> I do have pretty low PG-per-OSD counts (average of about 60 - that's
> next on my list), but I explicitly asked the balancer to use the bytes
> metric. Was I just being impatient? Is it expected that usage would go
> up overall for a time before starting to trend downward? Is my low PG
> count affecting this somehow? I would have expected things to move in
> the opposite direction pretty quickly as they do with 'ceph osd
> reweight-by-utilization'.
>
> Thoughts?
>
> Regards,
> Josh
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
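
Side note on the MAXVAR figures quoted above: a minimal sketch of how
they can be read, assuming 'ceph osd df' reports VAR as an OSD's
utilization divided by the cluster-wide average utilization (total used
over total capacity). The function names and the sample numbers in the
first example are hypothetical and only for illustration; this is not
the balancer's own scoring code.

```python
def osd_variance(osds):
    """Return (var_by_osd, max_var) for a dict of name -> (used, size).

    Assumption: VAR = (used / size) / (sum(used) / sum(size)),
    i.e. each OSD's utilization relative to the cluster-wide average.
    """
    total_used = sum(used for used, _ in osds.values())
    total_size = sum(size for _, size in osds.values())
    if total_used <= 0 or total_size <= 0:
        raise ValueError("need non-zero used and total capacity")
    average = total_used / total_size
    var_by_osd = {
        name: (used / size) / average for name, (used, size) in osds.items()
    }
    return var_by_osd, max(var_by_osd.values())


def implied_average(max_util_pct, max_var):
    """Average utilization (%) implied by the fullest OSD and MAXVAR."""
    return max_util_pct / max_var


# Hypothetical three-OSD cluster with equal-size devices.
var_by_osd, max_var = osd_variance({
    "osd.0": (60, 100),
    "osd.1": (70, 100),
    "osd.2": (80, 100),
})

# Figures from the message: 79% / 1.17 before, roughly 90% / 1.28 after.
avg_before = implied_average(79, 1.17)
avg_after = implied_average(90, 1.28)
```

Under that assumption the reported numbers imply an average utilization
of roughly 67.5% before and roughly 70.3% after the five days, so most
of the rise of the fullest OSD from 79% to 90% would come from a wider
spread rather than from overall growth. The "after" inputs are
approximate in the original message, so treat this as a rough reading.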