Re: ceph mgr balancer bad distribution

On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
<s.priebe@xxxxxxxxxxxx> wrote:
> Hi,
>
> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>> <s.priebe@xxxxxxxxxxxx> wrote:
>>> Hi,
>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>>> Is the score improving?
>>>>
>>>>     ceph balancer eval
>>>>
>>>> It should be decreasing over time as the variances drop toward zero.
>>>>
>>>> You mentioned a crush optimize code at the beginning... how did that
>>>> leave your cluster? The mgr balancer assumes that the crush weight of
>>>> each OSD is equal to its size in TB.
>>>> Do you have any osd reweights? crush-compat will gradually adjust
>>>> those back to 1.0.
>>>
>>> I reweighted them all back to their correct weight.
>>>
>>> Now the mgr balancer module says:
>>> mgr[balancer] Failed to find further optimization, score 0.010646
>>>
>>> But as you can see it's heavily imbalanced:
>>>
>>>
>>> Example:
>>> 49   ssd 0.84000  1.00000   864G   546G   317G 63.26 1.13  49
>>>
>>> vs:
>>>
>>> 48   ssd 0.84000  1.00000   864G   397G   467G 45.96 0.82  49
>>>
>>> 45% usage vs. 63%
>>
>> Ahh... but look, the num PGs are perfectly balanced, which implies
>> that you have a relatively large number of empty PGs.
>>
>> But regardless, this is annoying and I expect lots of operators to get
>> this result. (I've also observed that the num PGs gets balanced
>> perfectly at the expense of the other score metrics.)
>>
>> I was thinking of a patch around here [1] that lets operators add a
>> score weight on pgs, objects, bytes so we can balance how we like.
>>
>> Spandan: you were the last to look at this function. Do you think it
>> can be improved as I suggested?
>
> Yes, the PGs are perfectly distributed - but I think most people
> would like to have a distribution by bytes and not by PGs.
>
> Is this possible? I mean, in the code there is already a dict for pgs,
> objects and bytes - but I don't know how to change the logic. Just
> remove the pgs and objects from the dict?

It's worth a try to remove the pgs and objects from this dict:

https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L552

You can update that directly in the python code on your mgrs. Turn
the ceph balancer off, then fail over to the next mgr so it reloads the
module. Then:

ceph balancer eval
ceph balancer optimize myplan
ceph balancer eval myplan

Does it move in the right direction?
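
To illustrate why dropping pgs from the dict might help, here is a toy
sketch. It is not the scoring code in module.py: it assumes a made-up
score that is just the unweighted mean of each metric's coefficient of
variation across OSDs, and the names spread() and toy_score() are
hypothetical. The numbers are the two OSDs quoted above (49 PGs each,
397G vs. 546G used on equally sized devices).

```python
from statistics import mean, pstdev

# Two OSDs from the `ceph osd df` lines quoted in this thread.
# Both are 864G devices, so raw usage can be compared directly.
osds = {
    "osd.48": {"pgs": 49, "bytes": 397},  # bytes = used space in GB
    "osd.49": {"pgs": 49, "bytes": 546},
}

def spread(metric):
    """Coefficient of variation of one metric across OSDs (0 = perfectly even)."""
    values = [osd[metric] for osd in osds.values()]
    return pstdev(values) / mean(values)

def toy_score(metrics):
    """Unweighted mean of per-metric spreads. Hypothetical, NOT Ceph's formula."""
    return mean(spread(m) for m in metrics)

pgs_and_bytes = toy_score(["pgs", "bytes"])
bytes_only = toy_score(["bytes"])

print(f"pgs spread:  {spread('pgs'):.4f}")
print(f"pgs + bytes: {pgs_and_bytes:.4f}")
print(f"bytes only:  {bytes_only:.4f}")
```

In this toy model the perfectly even PG counts contribute zero and halve
the combined score, so the byte imbalance is less visible to whatever is
minimizing it. Whether the real score behaves the same way is exactly
what the eval/optimize/eval test above should tell you.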

-- dan