Re: Balancer module not balancing perfectly

Steve Taylor <steve.taylor@xxxxxxxxxxxxxxxx> · Tue, 30 Oct 2018 16:11:48 +0000

I had already played with those settings some, but I just tried again
with upmap_max_deviation set to 0.0001 and upmap_max_iterations set to
1000. Same result. Thanks for the suggestion though.
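
One way to check the per-OSD spread after each change is to summarize
the PG counts reported by 'ceph osd df --format json'. The sketch below
is a hypothetical helper, not part of Ceph. It assumes the JSON has a
top-level "nodes" list whose entries carry a "pgs" field, and the sample
data is made up for illustration:

```python
import json
from collections import Counter


def pg_count_spread(osd_df_json):
    """Return (min, max, histogram) of per-OSD PG counts.

    Assumes the layout of `ceph osd df --format json`: a top-level
    "nodes" list whose entries each have a "pgs" field.
    """
    nodes = json.loads(osd_df_json)["nodes"]
    # Keep only OSD entries; treat a missing "type" as an OSD.
    counts = [n["pgs"] for n in nodes if n.get("type", "osd") == "osd"]
    return min(counts), max(counts), Counter(counts)


# Hand-made sample, not real cluster output.
sample = json.dumps({"nodes": [
    {"id": 0, "name": "osd.0", "type": "osd", "pgs": 56},
    {"id": 1, "name": "osd.1", "type": "osd", "pgs": 61},
    {"id": 2, "name": "osd.2", "type": "osd", "pgs": 57},
]})

lo, hi, hist = pg_count_spread(sample)
print("min:", lo, "max:", hi, "histogram:", dict(hist))
```

On a real cluster the function would be fed the command's output
instead of the sample string.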

Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.

On Tue, 2018-10-30 at 12:06 -0400, David Turner wrote:
> From the balancer module's code for v 12.2.7 I noticed [1] these
> lines which reference [2] these 2 config options for upmap. You might
> try using more max iterations or a smaller max deviation to see if
> you can get a better balance in your cluster. I would try to start
> with [3] these commands/values and see if it improves your balance
> and/or allows you to generate a better map.
> 
> [1] 
> https://github.com/ceph/ceph/blob/v12.2.7/src/pybind/mgr/balancer/module.py#L671-L672
> [2] upmap_max_iterations (default 10)
> upmap_max_deviation (default .01)
> 
> [3] ceph config-key set mgr/balancer/upmap_max_iterations 50
> ceph config-key set mgr/balancer/upmap_max_deviation .005
> 
> On Tue, Oct 30, 2018 at 11:14 AM Steve Taylor <steve.taylor@xxxxxxxxxxxxxxxx> wrote:
> > I have a Luminous 12.2.7 cluster with 2 EC pools, both using k=8 and
> > m=2. Each pool lives on 20 dedicated OSD hosts with 18 OSDs each. Each
> > pool has 2048 PGs and is distributed across its 360 OSDs with host
> > failure domains. The OSDs are identical (4TB) and are weighted with
> > default weights (3.73).
> > 
> > Initially, and not surprisingly, the PG distribution was all over the
> > place with PG counts per OSD ranging from 40 to 83. I enabled the
> > balancer module in upmap mode and let it work its magic, which reduced
> > the range of the per-OSD PG counts to 56-61.
> > 
> > While 56-61 is obviously a whole lot better than 40-83, with upmap I
> > expected the range to be 56-57. If I run 'ceph balancer optimize
> > <plan>' again to attempt to create a new plan I get 'Error EALREADY:
> > Unable to find further optimization,or distribution is already
> > perfect.' I set the balancer's max_misplaced value to 1 in case that
> > was preventing further optimization, but I still get the same error.
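
The 56-57 expectation above is plain arithmetic on the numbers already
given: each EC PG places k+m = 10 shards, so one pool's 2048 PGs put
20480 shards on 360 OSDs. A quick Python sketch of that calculation
(variable names are illustrative, and it assumes every shard counts as
one PG on the OSD that holds it):

```python
import math

k, m = 8, 2          # EC profile from the message
pg_num = 2048        # PGs per pool
osds = 20 * 18       # 20 hosts x 18 OSDs = 360 OSDs per pool

shards = pg_num * (k + m)      # total PG shards to place
ideal = shards / osds          # average shards per OSD

# A perfectly even layout can only use the two integers around the mean.
low, high = math.floor(ideal), math.ceil(ideal)
osds_at_high = shards - low * osds   # OSDs that must carry `high` shards
osds_at_low = osds - osds_at_high    # the rest carry `low` shards

print(f"shards={shards} ideal={ideal:.2f} range={low}-{high}")
print(f"{osds_at_high} OSDs at {high}, {osds_at_low} OSDs at {low}")
```

So a perfect upmap layout would leave every OSD at 56 or 57 PGs, which
is why the observed 56-61 range still looks improvable.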
> > 
> > I'm sure I'm missing some config option or something that will allow it
> > to do better, but thus far I haven't been able to find anything in the
> > docs, mailing list archives, or balancer source code that helps. Any
> > ideas?
> > 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com