Re: Balancing with upmap

Thanks, and thanks for the log file sent off-list, which simply showed:

    2021-01-29 23:17:32.567 7f6155cae700  4 mgr[balancer] prepared 0/10 changes

This indeed means that the balancer believes those pools are all balanced
according to the config (which you have set to the defaults).
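
If you want to double-check which balancer options are actually set, one
quick way (this just lists anything changed from the defaults in the
centralized config) is:

    ceph config dump | grep mgr/balancer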

Could you please also share the output of `ceph osd df tree` so we can
see the distribution and OSD weights?

You might simply need to decrease the upmap_max_deviation from the
default of 5. On our clusters we do:

    ceph config set mgr mgr/balancer/upmap_max_deviation 1
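
After changing it, you can verify the value and check whether the balancer
now finds work to do (a quick check using the standard balancer commands):

    ceph config get mgr mgr/balancer/upmap_max_deviation
    ceph balancer eval
    ceph balancer status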

Cheers, Dan

On Fri, Jan 29, 2021 at 11:25 PM Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
>
> Hi Dan,
>
> Here is the output of ceph balancer status:
>
> ceph balancer status
> {
>     "last_optimize_duration": "0:00:00.074965",
>     "plans": [],
>     "mode": "upmap",
>     "active": true,
>     "optimize_result": "Unable to find further optimization, or
> pool(s) pg_num is decreasing, or distribution is already perfect",
>     "last_optimize_started": "Fri Jan 29 23:13:31 2021"
> }
>
>
> F.
>
> On 29/01/2021 at 10:57, Dan van der Ster wrote:
> > Hi Francois,
> >
> > What is the output of `ceph balancer status`?
> > Also, can you increase debug_mgr to 4/5 and then share the log file of
> > the active mgr?
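> > (For reference, one way to do that is e.g.
> >
> >     ceph config set mgr debug_mgr 4/5
> >
> > and `ceph config rm mgr debug_mgr` to revert it afterwards.)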
> >
> > Best,
> >
> > Dan
> >
> > On Fri, Jan 29, 2021 at 10:54 AM Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
> >> Thanks for your suggestion. I will have a look!
> >>
> >> But I am a bit surprised that the "official" balancer seems so inefficient!
> >>
> >> F.
> >>
> >> On 28/01/2021 at 12:00, Jonas Jelten wrote:
> >>> Hi!
> >>>
> >>> We also suffer heavily from this so I wrote a custom balancer which yields much better results:
> >>> https://github.com/TheJJ/ceph-balancer
> >>>
> >>> After you run it, it echoes the PG movements it suggests. You can then just run those commands and the cluster will balance further.
> >>> It's kinda work in progress, so I'm glad about your feedback.
> >>>
> >>> Maybe it helps you :)
> >>>
> >>> -- Jonas
> >>>
> >>> On 27/01/2021 17.15, Francois Legrand wrote:
> >>>> Hi all,
> >>>> I have a cluster with 116 disks (24 new 16TB disks added in December, the rest 8TB) running nautilus 14.2.16.
> >>>> I moved (8 months ago) from crush_compat to upmap balancing.
> >>>> But the cluster does not seem well balanced: the number of pgs on the 8TB disks varies from 26 to 52, and their usage from 35 to 69%.
> >>>> The recent 16TB disks are more homogeneous, with 48 to 61 pgs and usage between 30 and 43%.
> >>>> Last week, I realized that some osds were maybe not using upmap, because I ran ceph osd crush weight-set ls and got (compat) as the result.
> >>>> Thus I ran ceph osd crush weight-set rm-compat, which triggered some rebalancing. There has been no more recovery for 2 days now, but the cluster is still unbalanced.
> >>>> As far as I understand, upmap is supposed to reach an equal number of pgs on all the disks (weighted by their capacity, I guess).
> >>>> Thus I would expect more or less 30 pgs on the 8TB disks and 60 on the 16TB ones, and around 50% usage everywhere. Which is not the case (by far).
> >>>> The problem is that this impacts the free space available in the pools (264 TiB, while there is more than 578 TiB free in the cluster), because free space seems to be computed from the space available before the first osd becomes full!
> >>>> Is this normal? Did I miss something? What can I do?
> >>>>
> >>>> F.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



