On 21.08.2018 at 17:28, Gregory Farnum wrote:
> You should be able to create issues now; we had a misconfiguration in
> the tracker following the recent spam attack.
> -Greg
>
> On Tue, Aug 21, 2018 at 3:07 AM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>>
>> On 21.08.2018 at 12:03, Stefan Priebe - Profihost AG wrote:
>>>
>>> On 21.08.2018 at 11:56, Dan van der Ster wrote:
>>>> On Tue, Aug 21, 2018 at 11:54 AM Stefan Priebe - Profihost AG
>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>
>>>>> On 21.08.2018 at 11:47, Dan van der Ster wrote:
>>>>>> On Mon, Aug 20, 2018 at 10:45 PM Stefan Priebe - Profihost AG
>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 20.08.2018 at 22:38, Dan van der Ster wrote:
>>>>>>>> On Mon, Aug 20, 2018 at 10:19 PM Stefan Priebe - Profihost AG
>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 20.08.2018 at 21:52, Sage Weil wrote:
>>>>>>>>>> On Mon, 20 Aug 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> since Loic seems to have left Ceph development and his wonderful crush
>>>>>>>>>>> optimization tool isn't working anymore, I'm trying to get a good
>>>>>>>>>>> distribution with the ceph balancer.
>>>>>>>>>>>
>>>>>>>>>>> Sadly it does not work as well as I want.
>>>>>>>>>>>
>>>>>>>>>>> # ceph osd df | sort -k8
>>>>>>>>>>>
>>>>>>>>>>> shows 75% to 83% usage, a difference of 8 points, which is too much for me.
>>>>>>>>>>> I'm optimizing by bytes.
>>>>>>>>>>>
>>>>>>>>>>> # ceph balancer eval
>>>>>>>>>>> current cluster score 0.005420 (lower is better)
>>>>>>>>>>>
>>>>>>>>>>> # ceph balancer eval $OPT_NAME
>>>>>>>>>>> plan spriebe_2018-08-20_19:36 final score 0.005456 (lower is better)
>>>>>>>>>>>
>>>>>>>>>>> I'm unable to optimize further ;-( Is there any chance to optimize
>>>>>>>>>>> further, even at the cost of more rebalancing?
>>>>>>>>>>
>>>>>>>>>> The scoring that the balancer module is doing is currently a hybrid of pg
>>>>>>>>>> count, bytes, and object count. Picking a single metric might help a bit
>>>>>>>>>> (as those 3 things are not always perfectly aligned).
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> OK, I found a bug in the balancer code which seems to be present in all
>>>>>>>>> releases.
>>>>>>>>>
>>>>>>>>> 861 best_ws = next_ws
>>>>>>>>> 862 best_ow = next_ow
>>>>>>>>>
>>>>>>>>> should be:
>>>>>>>>>
>>>>>>>>> 861 best_ws = copy.deepcopy(next_ws)
>>>>>>>>> 862 best_ow = copy.deepcopy(next_ow)
>>>>>>>>>
>>>>>>>>> Otherwise it does not use the best result but the last one.
>>>>>>>>
>>>>>>>> Interesting... does that change improve things?
>>>>>>>
>>>>>>> It fixes the following (mgr debug output):
>>>>>>> 2018-08-20 22:33:46.078525 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001180, misplacing 0.000912
>>>>>>> 2018-08-20 22:33:46.078574 7f2fbc3b6700 0 mgr[balancer] Score got worse, taking another step
>>>>>>> 2018-08-20 22:33:46.078770 7f2fbc3b6700 0 mgr[balancer] Balancing root default (pools ['cephstor2']) by bytes
>>>>>>> 2018-08-20 22:33:46.156326 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001180, misplacing 0.000912
>>>>>>> 2018-08-20 22:33:46.156374 7f2fbc3b6700 0 mgr[balancer] Score got worse, taking another step
>>>>>>> 2018-08-20 22:33:46.156581 7f2fbc3b6700 0 mgr[balancer] Balancing root default (pools ['cephstor2']) by bytes
>>>>>>> 2018-08-20 22:33:46.233818 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001180, misplacing 0.000912
>>>>>>> 2018-08-20 22:33:46.233868 7f2fbc3b6700 0 mgr[balancer] Score got worse, taking another step
>>>>>>> 2018-08-20 22:33:46.234043 7f2fbc3b6700 0 mgr[balancer] Balancing root default (pools ['cephstor2']) by bytes
>>>>>>> 2018-08-20 22:33:46.313212 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001180, misplacing 0.000912
>>>>>>> 2018-08-20 22:33:46.313714 7f2fbc3b6700 0 mgr[balancer] Score got worse, trying smaller step 0.000244
>>>>>>> 2018-08-20 22:33:46.313887 7f2fbc3b6700 0 mgr[balancer] Balancing root default (pools ['cephstor2']) by bytes
>>>>>>> 2018-08-20 22:33:46.391586 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001152, misplacing 0.001141
>>>>>>> 2018-08-20 22:33:46.393374 7f2fbc3b6700 0 mgr[balancer] Balancing root default (pools ['cephstor2']) by bytes
>>>>>>> 2018-08-20 22:33:46.473956 7f2fbc3b6700 0 mgr[balancer] Step result score 0.001152 -> 0.001180, misplacing 0.000912
>>>>>>> 2018-08-20 22:33:46.474001 7f2fbc3b6700 0 mgr[balancer] Score got worse, taking another step
>>>>>>> 2018-08-20 22:33:46.474046 7f2fbc3b6700 0 mgr[balancer] Success, score 0.001155 -> 0.001152
>>>>>>>
>>>>>>> BUT:
>>>>>>> # ceph balancer eval myplan
>>>>>>> plan myplan final score 0.001180 (lower is better)
>>>>>>>
>>>>>>> So the final plan does NOT contain the expected optimization. The
>>>>>>> deepcopy fixes it.
>>>>>>>
>>>>>>> After:
>>>>>>> # ceph balancer eval myplan
>>>>>>> plan myplan final score 0.001152 (lower is better)
>>>>>>>
>>>>>>
>>>>>> OK that looks like a bug. Did you create a tracker or PR?
>>>>>
>>>>> No, not yet. Should I create a PR on GitHub with the fix?
>>>>
>>>> Yeah, probably tracker first (requesting luminous,mimic backports),
>>>> then a PR on master with "Fixes: tracker..."

Pull request: https://github.com/ceph/ceph/pull/23682
Tracker: http://tracker.ceph.com/issues/27000

Stefan

>>>
>>> Will do, but I can't find a create button in the tracker. I've opened
>>> several reports in the past, but right now it seems I can't create a ticket.
>>
>> http://tracker.ceph.com/projects/ceph/issues/new
>>
>> =>
>>
>> 403
>> You are not authorized to access this page.
>>
>>> Stefan
>>>
>>>>
>>>> -- dan
>>>>
>>>>>
>>>>>> -- Dan
>>>>>>
>>>>>>>>
>>>>>>>> Also, if most of your data is in one pool you can try ceph balancer
>>>>>>>> eval <pool-name>
>>>>>>>
>>>>>>> Already tried this; it doesn't help much.
>>>>>>>
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>>
>>>>>>>> -- dan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm also using this one:
>>>>>>>>> https://github.com/ceph/ceph/pull/20665/commits/c161a74ad6cf006cd9b33b40fd7705b67c170615
>>>>>>>>>
>>>>>>>>> to optimize by bytes only.
>>>>>>>>>
>>>>>>>>> Greets,
>>>>>>>>> Stefan
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
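
For readers who want to see why the plain assignment in the thread keeps "the last" state rather than "the best" one, here is a minimal, self-contained sketch of the aliasing effect. It is not the balancer code: the `score` and `search` functions, the OSD names, and the weights are all hypothetical, and it assumes the working state is mutated in place between steps, which is what the thread's description suggests.

```python
import copy


def score(weights):
    """Toy score (lower is better): spread between largest and smallest weight."""
    return max(weights.values()) - min(weights.values())


def search(steps, use_deepcopy):
    """Apply each step to a working dict in place and remember the best state seen.

    Returns (best score observed, score of the state actually kept as "best").
    """
    # Hypothetical starting weights; the names only echo those in the thread.
    next_ws = {"osd.0": 1.0, "osd.1": 3.0}
    best_ws = copy.deepcopy(next_ws)
    best_score = score(best_ws)
    for delta in steps:
        # The working state is mutated in place on every step.
        next_ws["osd.0"] += delta
        next_score = score(next_ws)
        if next_score < best_score:
            best_score = next_score
            # Without a copy, best_ws is only another name for next_ws,
            # so later steps silently overwrite the remembered "best" state.
            best_ws = copy.deepcopy(next_ws) if use_deepcopy else next_ws
    return best_score, score(best_ws)
```

With steps `[2.0, 1.5]`, the first step reaches a perfect score of 0.0 and the second makes it worse. `search([2.0, 1.5], use_deepcopy=True)` returns `(0.0, 0.0)`, while `search([2.0, 1.5], use_deepcopy=False)` returns `(0.0, 1.5)`: the reported best score and the state that is actually kept disagree, which mirrors the "Success, score ... -> 0.001152" log line followed by a plan that evaluates to 0.001180.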