Re: ceph balancer: further optimizations?

Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx> · Tue, 21 Aug 2018 12:03:29 +0200

On 21.08.2018 at 11:56, Dan van der Ster wrote:
> On Tue, Aug 21, 2018 at 11:54 AM Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>>
>> On 21.08.2018 at 11:47, Dan van der Ster wrote:
>>> On Mon, Aug 20, 2018 at 10:45 PM Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>> On 20.08.2018 at 22:38, Dan van der Ster wrote:
>>>>> On Mon, Aug 20, 2018 at 10:19 PM Stefan Priebe - Profihost AG
>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>
>>>>>> On 20.08.2018 at 21:52, Sage Weil wrote:
>>>>>>> On Mon, 20 Aug 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Since Loic seems to have left Ceph development and his wonderful crush
>>>>>>>> optimization tool isn't working anymore, I'm trying to get a good
>>>>>>>> distribution with the ceph balancer.
>>>>>>>>
>>>>>>>> Sadly it does not work as well as I want.
>>>>>>>>
>>>>>>>> # ceph osd df | sort -k8
>>>>>>>>
>>>>>>>> shows 75 to 83% usage, a difference of 8 percentage points, which is
>>>>>>>> too much for me. I'm optimizing by bytes.
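
To put a number on that spread without sorting columns by eye, here is a minimal sketch. It assumes the JSON form of the same command (`ceph osd df --format json`) returns a `nodes` list whose entries carry a `utilization` percentage; the helper name and the sample values are made up for illustration and are not real cluster output.

```python
import json

def utilization_spread(osd_df):
    """Return (lowest, highest, spread) of per-OSD utilization in percent.

    Assumption: `osd_df` is the parsed output of `ceph osd df --format json`
    and every entry in its "nodes" list has a "utilization" field.
    """
    values = [node["utilization"] for node in osd_df["nodes"]]
    return min(values), max(values), max(values) - min(values)

# Illustrative sample only, shaped like the assumed JSON output.
sample = json.loads("""
{"nodes": [
    {"name": "osd.0", "utilization": 75.0},
    {"name": "osd.1", "utilization": 83.0},
    {"name": "osd.2", "utilization": 79.5}
]}
""")

low, high, spread = utilization_spread(sample)
print(f"{low:.1f}% .. {high:.1f}% (spread {spread:.1f} points)")
```

On a live cluster the input would come from the command output instead of the sample string, for example via `json.load(sys.stdin)`.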
>>>>>>>>
>>>>>>>> # ceph balancer eval
>>>>>>>> current cluster score 0.005420 (lower is better)
>>>>>>>>
>>>>>>>> # ceph balancer eval $OPT_NAME
>>>>>>>> plan spriebe_2018-08-20_19:36 final score 0.005456 (lower is better)
>>>>>>>>
>>>>>>>> I'm unable to optimize further ;-( Is there any chance to optimize
>>>>>>>> further, even at the cost of more rebalancing?
>>>>>>>
>>>>>>> The scoring that the balancer module is doing is currently a hybrid of pg
>>>>>>> count, bytes, and object count.  Picking a single metric might help a bit
>>>>>>> (as those 3 things are not always perfectly aligned).
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> OK, I found a bug in the balancer code which seems to be present in all
>>>>>> releases.
>>>>>>
>>>>>>  861                     best_ws = next_ws
>>>>>>  862                     best_ow = next_ow
>>>>>>
>>>>>>
>>>>>> should be:
>>>>>>
>>>>>>  861                     best_ws = copy.deepcopy(next_ws)
>>>>>>  862                     best_ow = copy.deepcopy(next_ow)
>>>>>>
>>>>>> Otherwise it does not use the best result but the last one.
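
The effect can be reproduced outside the balancer with a toy loop. This is only a sketch of the aliasing problem, not the module's code: it assumes the search keeps mutating the same `next_ws` object on every step, which is what the observed behaviour suggests. The `score` function, the weights and the trial values are invented for illustration.

```python
import copy

def score(ws):
    # Toy score: gap between the largest and smallest weight (lower is better).
    return max(ws.values()) - min(ws.values())

def optimize(trial_values, snapshot):
    """Try each value in place and remember the best state via `snapshot`."""
    next_ws = {"osd.0": 1.0, "osd.1": 0.6}   # hypothetical working state
    best_ws = snapshot(next_ws)
    best_score = score(next_ws)
    for value in trial_values:
        next_ws["osd.1"] = value             # in-place mutation on every step
        if score(next_ws) < best_score:
            best_score = score(next_ws)
            best_ws = snapshot(next_ws)      # alias or real copy, see below
    return best_ws, best_score

def alias(ws):
    # Mirrors plain assignment: returns the very same object.
    return ws

trials = [0.9, 0.5]                          # second trial is worse than the first
buggy_ws, buggy_score = optimize(trials, alias)
fixed_ws, fixed_score = optimize(trials, copy.deepcopy)

print("alias:   ", buggy_ws, "reported best score", round(buggy_score, 3))
print("deepcopy:", fixed_ws, "reported best score", round(fixed_score, 3))
```

With the alias, the reported best score belongs to the first trial, but the returned state is whatever the last trial left behind. With the deep copy, the returned state matches the reported score.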
>>>>>
>>>>> Interesting... does that change improve things?
>>>>
>>>> It fixes the following (mgr debug output):
>>>> 2018-08-20 22:33:46.078525 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001180, misplacing 0.000912
>>>> 2018-08-20 22:33:46.078574 7f2fbc3b6700  0 mgr[balancer] Score got
>>>> worse, taking another step
>>>> 2018-08-20 22:33:46.078770 7f2fbc3b6700  0 mgr[balancer] Balancing root
>>>> default (pools ['cephstor2']) by bytes
>>>> 2018-08-20 22:33:46.156326 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001180, misplacing 0.000912
>>>> 2018-08-20 22:33:46.156374 7f2fbc3b6700  0 mgr[balancer] Score got
>>>> worse, taking another step
>>>> 2018-08-20 22:33:46.156581 7f2fbc3b6700  0 mgr[balancer] Balancing root
>>>> default (pools ['cephstor2']) by bytes
>>>> 2018-08-20 22:33:46.233818 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001180, misplacing 0.000912
>>>> 2018-08-20 22:33:46.233868 7f2fbc3b6700  0 mgr[balancer] Score got
>>>> worse, taking another step
>>>> 2018-08-20 22:33:46.234043 7f2fbc3b6700  0 mgr[balancer] Balancing root
>>>> default (pools ['cephstor2']) by bytes
>>>> 2018-08-20 22:33:46.313212 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001180, misplacing 0.000912
>>>> 2018-08-20 22:33:46.313714 7f2fbc3b6700  0 mgr[balancer] Score got
>>>> worse, trying smaller step 0.000244
>>>> 2018-08-20 22:33:46.313887 7f2fbc3b6700  0 mgr[balancer] Balancing root
>>>> default (pools ['cephstor2']) by bytes
>>>> 2018-08-20 22:33:46.391586 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001152, misplacing 0.001141
>>>> 2018-08-20 22:33:46.393374 7f2fbc3b6700  0 mgr[balancer] Balancing root
>>>> default (pools ['cephstor2']) by bytes
>>>> 2018-08-20 22:33:46.473956 7f2fbc3b6700  0 mgr[balancer] Step result
>>>> score 0.001152 -> 0.001180, misplacing 0.000912
>>>> 2018-08-20 22:33:46.474001 7f2fbc3b6700  0 mgr[balancer] Score got
>>>> worse, taking another step
>>>> 2018-08-20 22:33:46.474046 7f2fbc3b6700  0 mgr[balancer] Success, score
>>>> 0.001155 -> 0.001152
>>>>
>>>> BUT:
>>>> # ceph balancer eval myplan
>>>> plan myplan final score 0.001180 (lower is better)
>>>>
>>>> So the final plan does NOT contain the expected optimization. The
>>>> deepcopy fixes it.
>>>>
>>>> After:
>>>> # ceph balancer eval myplan
>>>> plan myplan final score 0.001152 (lower is better)
>>>>
>>>
>>> OK that looks like a bug. Did you create a tracker or PR?
>>
>> No, not yet. Should I create a PR on GitHub with the fix?
> 
> Yeah, probably tracker first (requesting luminous,mimic backports),
> then a PR on master with "Fixes: tracker..."

Will do, but I can't find a create button in the tracker. I've opened
several reports in the past, but right now it seems I can't create a ticket.

Stefan

> 
> -- dan
> 
> 
>>
>>> -- Dan
>>>
>>>
>>>>>
>>>>> Also, if most of your data is in one pool you can try ceph balancer
>>>>> eval <pool-name>
>>>>
>>>> Already tried this; it doesn't help much.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>>
>>>>> -- dan
>>>>>
>>>>>>
>>>>>> I'm also using this one:
>>>>>> https://github.com/ceph/ceph/pull/20665/commits/c161a74ad6cf006cd9b33b40fd7705b67c170615
>>>>>>
>>>>>> to optimize by bytes only.
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com