Re: ceph mgr balancer bad distribution

Nice, thanks. I will try that soon.

Can you tell me how to change the log level to info for the balancer module?

On 01.03.2018 at 11:30, Dan van der Ster wrote:
> On Thu, Mar 1, 2018 at 10:40 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>
>>>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>>>>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>>>>>>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>> Hi,
>>>>>>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>>>>>>>> Is the score improving?
>>>>>>>>>
>>>>>>>>>     ceph balancer eval
>>>>>>>>>
>>>>>>>>> It should be decreasing over time as the variances drop toward zero.
>>>>>>>>>
>>>>>>>>> You mentioned a crush optimize code at the beginning... how did that
>>>>>>>>> leave your cluster? The mgr balancer assumes that the crush weight of
>>>>>>>>> each OSD is equal to its size in TB.
>>>>>>>>> Do you have any osd reweights? crush-compat will gradually adjust
>>>>>>>>> those back to 1.0.
>>>>>>>>
>>>>>>>> I reweighted them all back to their correct weight.
>>>>>>>>
>>>>>>>> Now the mgr balancer module says:
>>>>>>>> mgr[balancer] Failed to find further optimization, score 0.010646
>>>>>>>>
>>>>>>>> But as you can see it's heavily imbalanced:
>>>>>>>>
>>>>>>>>
>>>>>>>> Example:
>>>>>>>> 49   ssd 0.84000  1.00000   864G   546G   317G 63.26 1.13  49
>>>>>>>>
>>>>>>>> vs:
>>>>>>>>
>>>>>>>> 48   ssd 0.84000  1.00000   864G   397G   467G 45.96 0.82  49
>>>>>>>>
>>>>>>>> 45% usage vs. 63%
>>>>>>>
>>>>>>> Ahh... but look, the num PGs are perfectly balanced, which implies
>>>>>>> that you have a relatively large number of empty PGs.
>>>>>>>
>>>>>>> But regardless, this is annoying and I expect lots of operators to get
>>>>>>> this result. (I've also observed that the num PGs gets balanced
>>>>>>> perfectly at the expense of the other score metrics.)
>>>>>>>
>>>>>>> I was thinking of a patch around here [1] that lets operators add a
>>>>>>> score weight on pgs, objects, bytes so we can balance how we like.
>>>>>>>
>>>>>>> Spandan: you were the last to look at this function. Do you think it
>>>>>>> can be improved as I suggested?
>>>>>>
>>>>>> Yes, the PGs are perfectly distributed, but I think most people
>>>>>> would like to have a distribution by bytes and not by PGs.
>>>>>>
>>>>>> Is this possible? I mean, in the code there is already a dict for pgs,
>>>>>> objects and bytes, but I don't know how to change the logic. Just
>>>>>> remove the pgs and objects from the dict?
>>>>>
>>>>> It's worth a try to remove the pgs and objects from this dict:
>>>>> https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L552
>>>>
>>>> Do I have to change this 3 to 1, because we then have only one item in
>>>> the dict? I'm not sure where the 3 comes from.
>>>>         pe.score /= 3 * len(roots)
>>>>
>>>
>>> I'm pretty sure that 3 is just for our 3 metrics. Indeed you can
>>> change that to 1.
>>>
>>> I'm trying this on our test cluster here too. The last few lines of
>>> output from `ceph balancer eval-verbose` will confirm that the score
>>> is based only on bytes.
>>>
>>> But I'm not sure this is going to work -- indeed the score here went
>>> from ~0.02 to 0.08, but the do_crush_compat doesn't manage to find a
>>> better score.
>>
>> Maybe this:
>>
>> https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L682
>>
>> I'm trying with that = 'bytes'
> 
> That seems to be working. I sent this PR as a start
> https://github.com/ceph/ceph/pull/20665
> 
> I'm not sure we need to mess with the score function, on second thought.
> 
> -- dan
> 
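
To illustrate why a score averaged over several metrics can look small
while the byte usage is clearly uneven, here is a minimal sketch. It is
not the balancer module's scoring code: the function names
metric_spread and combined_score are hypothetical, and the coefficient
of variation is only a stand-in for the real per-metric score. The
input uses the PG counts and used gigabytes of OSDs 49 and 48 quoted
in the thread.

```python
from statistics import mean, pstdev


def metric_spread(values):
    """Coefficient of variation: population std dev divided by the mean.

    Assumption: a simple stand-in for a per-metric score, not the
    function the balancer module actually uses.
    """
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


def combined_score(per_osd, metrics):
    """Average the spread over the chosen metrics (lower is better).

    Dividing by len(metrics) plays the role of the constant divisor
    discussed in the thread (3 for three metrics, 1 for one).
    """
    total = sum(metric_spread([osd[m] for osd in per_osd]) for m in metrics)
    return total / len(metrics)


# OSD 49 and OSD 48 from the thread: 49 PGs each, 546G vs. 397G used.
osds = [
    {"pgs": 49, "bytes": 546},
    {"pgs": 49, "bytes": 397},
]

pgs_only = combined_score(osds, ["pgs"])          # PG counts are identical
bytes_only = combined_score(osds, ["bytes"])      # byte usage differs
mixed = combined_score(osds, ["pgs", "bytes"])    # balanced PGs dilute the score

print(pgs_only, bytes_only, mixed)
```

In this sketch the perfectly balanced PG metric contributes nothing,
so mixing it in halves the bytes-only value. That is the same dilution
effect described above, where an even PG count hides uneven byte usage.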