Re: ceph mgr balancer bad distribution

On 01.03.2018 at 09:58, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>> Hi,
>>
>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>> Hi,
>>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>>>> Is the score improving?
>>>>>
>>>>>     ceph balancer eval
>>>>>
>>>>> It should be decreasing over time as the variances drop toward zero.
>>>>>
>>>>> You mentioned a crush optimize code at the beginning... how did that
>>>>> leave your cluster? The mgr balancer assumes that the crush weight of
>>>>> each OSD is equal to its size in TB.
>>>>> Do you have any osd reweights? crush-compat will gradually adjust
>>>>> those back to 1.0.
>>>>
>>>> I reweighted them all back to their correct weight.
>>>>
>>>> Now the mgr balancer module says:
>>>> mgr[balancer] Failed to find further optimization, score 0.010646
>>>>
>>>> But as you can see it's heavily imbalanced:
>>>>
>>>>
>>>> Example:
>>>> 49   ssd 0.84000  1.00000   864G   546G   317G 63.26 1.13  49
>>>>
>>>> vs:
>>>>
>>>> 48   ssd 0.84000  1.00000   864G   397G   467G 45.96 0.82  49
>>>>
>>>> 45% usage vs. 63%
>>>
>>> Ahh... but look, the num PGs are perfectly balanced, which implies
>>> that you have a relatively large number of empty PGs.
>>>
>>> But regardless, this is annoying and I expect lots of operators to get
>>> this result. (I've also observed that the num PGs gets balanced
>>> perfectly at the expense of the other score metrics.)
>>>
>>> I was thinking of a patch around here [1] that lets operators add a
>>> score weight on pgs, objects, bytes so we can balance how we like.
>>>
>>> Spandan: you were the last to look at this function. Do you think it
>>> can be improved as I suggested?
>>
>> Yes, the PGs are perfectly distributed, but I think most people
>> would like to have a distribution by bytes and not by PGs.
>>
>> Is this possible? I mean, in the code there is already a dict for pgs,
>> objects and bytes, but I don't know how to change the logic. Just
>> remove the pgs and objects from the dict?
> 
> It's worth a try to remove the pgs and objects from this dict:
> https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L552

Do I have to change this 3 to 1 because we have only one item in the dict?
I'm not sure where the 3 comes from.
        pe.score /= 3 * len(roots)
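
My guess is that the 3 is simply the number of metrics (pgs, objects,
bytes) being averaged, but that is an assumption on my side and not
confirmed anywhere in this thread. Under that assumption, a minimal
sketch of the averaging could look like the following. The function
name, the dict layout and the numbers are hypothetical and only
illustrate the arithmetic; this is not the code from module.py.

```python
def average_score(score_by_root):
    """Average per-metric scores over all roots and all metrics.

    score_by_root maps a CRUSH root name to a dict of
    {metric name: score}. Assumption: every root carries the
    same set of metrics.
    """
    roots = list(score_by_root)
    if not roots:
        return 0.0
    metrics = list(score_by_root[roots[0]])
    total = sum(score_by_root[r][m] for r in roots for m in metrics)
    # Divide by the actual number of metrics instead of a hard-coded 3.
    return total / (len(metrics) * len(roots))


# Made-up scores for a single root, with all three metrics ...
three_metrics = {"default": {"pgs": 0.0, "objects": 0.03, "bytes": 0.06}}
# ... and with only the bytes metric left in the dict.
bytes_only = {"default": {"bytes": 0.06}}

score_three = average_score(three_metrics)
score_bytes = average_score(bytes_only)
```

If that reading is right, leaving the 3 in place after removing two
metrics would only scale the score down by a constant factor, so the
direction of the optimization should not change, only the absolute
value that gets reported.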


> You can update that directly in the Python code on your mgrs. Turn
> the ceph balancer off, then fail over to the next mgr so it reloads the
> module. Then:
> 
> ceph balancer eval
> ceph balancer optimize myplan
> ceph balancer eval myplan
> 
> Does it move in the right direction?
> 
> -- dan
> 
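
To compare before and after, the imbalance in the two OSD rows quoted
earlier in the thread can be put into numbers with a small script. This
is only a sketch: it assumes the ten columns are ID, CLASS, WEIGHT,
REWEIGHT, SIZE, USE, AVAIL, %USE, VAR and PGS, and the helper names
are made up for illustration.

```python
# The two rows quoted earlier in this thread.
ROWS = """\
49   ssd 0.84000  1.00000   864G   546G   317G 63.26 1.13  49
48   ssd 0.84000  1.00000   864G   397G   467G 45.96 0.82  49
"""


def parse_rows(text):
    """Extract OSD id, %USE and PG count from whitespace-separated rows.

    Assumption: column 8 is %USE and column 10 is the PG count.
    Lines that do not have exactly ten fields are skipped.
    """
    osds = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue
        osds.append({
            "id": int(fields[0]),
            "pct_used": float(fields[7]),
            "pgs": int(fields[9]),
        })
    return osds


def spread(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


osds = parse_rows(ROWS)
pg_spread = spread([o["pgs"] for o in osds])
use_spread = spread([o["pct_used"] for o in osds])
print("PG spread: %d, %%USE spread: %.2f" % (pg_spread, use_spread))
```

For these two OSDs the PG spread is zero while the utilization differs
by roughly 17 percentage points, which is the situation described
above: balanced by PG count, not by bytes.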
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


