Re: Need help with upmap feature on luminous

Kári Bertilsson <karibertils@xxxxxxxxx> · Wed, 6 Feb 2019 04:48:43 +0000

Ceph version 12.2.8-pve1 on Proxmox.

ceph osd df tree @ https://pastebin.com/e68fJ5fM

I added `debug mgr = 4/5` to the [global] section of ceph.conf on the active mgr and restarted the mgr service. Is this correct?

I noticed some config settings in the mgr logs and changed the config to use:
    "mgr/balancer/max_misplaced": "1",
    "mgr/balancer/mode": "upmap",
    "mgr/balancer/upmap_max_deviation": "0.0001",
    "mgr/balancer/upmap_max_iterations": "1000"

After this I get a larger plan. I tried running the upmap commands manually, but for some reason no data is being moved. All PGs are active+clean and a few are scrubbing. Maybe it won't run until the scrub is over?

I pasted a snippet of the mgr logs that I found interesting...

Given that the new plan looks good now, the problem seems to be that the upmap directive is being ignored?

On Wed, Feb 6, 2019 at 2:15 AM Konstantin Shalygin <k0ste@xxxxxxxx> wrote:

I previously enabled upmap and used automatic balancing with "ceph balancer
on". I got very good results and the OSDs ended up with perfectly distributed
PGs.

Now, after adding several new OSDs, auto balancing does not seem to be
working anymore. OSD usage ranges from 30% to 50%, where previously all OSDs
had almost the same usage.

I turned off the auto balancer and tried manually running a plan:

# ceph balancer reset
# ceph balancer optimize myplan
# ceph balancer show myplan
ceph osd pg-upmap-items 41.1 106 125 95 121 84 34 36 99 72 126
ceph osd pg-upmap-items 41.5 12 121 65 3 122 52 5 126
ceph osd pg-upmap-items 41.b 117 99 65 125
ceph osd pg-upmap-items 41.c 49 121 81 131
ceph osd pg-upmap-items 41.e 61 82 73 52 122 46 84 118
ceph osd pg-upmap-items 41.f 71 127 15 121 56 82
ceph osd pg-upmap-items 41.12 81 92
ceph osd pg-upmap-items 41.17 35 127 71 44
ceph osd pg-upmap-items 41.19 81 131 21 119 18 52
ceph osd pg-upmap-items 41.25 18 52 37 125 40 3 41 34 71 127 4 128
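
For readers following the plan above: each `pg-upmap-items` line is a PG id followed by pairs of OSD ids, read as (from, to). The following is a minimal Python sketch, not part of the original message, that parses such lines and tallies how many PG mappings each OSD would lose or gain. The helper names `parse_plan` and `net_change` are made up for illustration, and the sketch assumes plain integer OSD ids as shown in the plan.

```python
from collections import Counter


def parse_plan(text):
    """Return {pgid: [(from_osd, to_osd), ...]} from pg-upmap-items lines."""
    plan = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a 'ceph osd pg-upmap-items <pgid> ...' line.
        if len(fields) < 4 or fields[:3] != ["ceph", "osd", "pg-upmap-items"]:
            continue
        pgid = fields[3]
        osds = [int(x) for x in fields[4:]]  # assumes integer OSD ids
        if len(osds) % 2:
            raise ValueError(f"odd number of OSD ids for pg {pgid}")
        # Consecutive ids form (from, to) pairs.
        plan[pgid] = list(zip(osds[0::2], osds[1::2]))
    return plan


def net_change(plan):
    """Return {osd: delta}: -1 per mapping moved away, +1 per mapping received."""
    delta = Counter()
    for pairs in plan.values():
        for src, dst in pairs:
            delta[src] -= 1
            delta[dst] += 1
    return dict(delta)


sample = """\
ceph osd pg-upmap-items 41.b 117 99 65 125
ceph osd pg-upmap-items 41.12 81 92
"""
plan = parse_plan(sample)
for pgid, pairs in plan.items():
    print(pgid, pairs)
print(net_change(plan))
```

Comparing the tallied changes against `ceph osd df tree` is one way to judge whether a plan would move data toward the emptier OSDs at all.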

After running this plan there is no difference, and there is still a huge
imbalance across the OSDs. Creating a new plan gives the same plan again.

# ceph balancer eval
current cluster score 0.015162 (lower is better)

Balancer eval shows quite a low score, so it seems to think the PG
distribution is already optimized?

Since I can't get this working again, I looked into the offline
optimization at http://docs.ceph.com/docs/mimic/rados/operations/upmap/

I have 2 pools.
One is a replicated pool using 3 OSDs with the "10k" device class.
The remaining OSDs have the "hdd" device class.

The resulting out.txt contains a much larger plan, but it would map a lot of
PGs to the "10k" OSDs (where they should not be), and I can't seem to
find any way to exclude these 3 OSDs.
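
One possible workaround, offered as an untested assumption rather than a recommended procedure, is to post-process out.txt and drop every remap pair that touches one of the OSDs to be excluded before applying the commands. The minimal Python sketch below does that. The ids in `EXCLUDED` and the helper name `filter_plan` are hypothetical placeholders, since the real ids of the "10k" OSDs are not given in this thread. The upmap documentation linked above also lists an `--upmap-pool` option for osdmaptool, which may be a simpler way to restrict the optimization to one pool.

```python
# Hypothetical OSD ids standing in for the three "10k" OSDs.
EXCLUDED = {1, 2, 3}

PREFIX = ["ceph", "osd", "pg-upmap-items"]


def filter_plan(lines, excluded):
    """Drop (from, to) pairs that touch an excluded OSD.

    Lines that are not pg-upmap-items commands are passed through unchanged.
    A pg-upmap-items line whose pairs are all dropped is omitted entirely.
    Assumes integer OSD ids, as in the plans shown in this thread.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or fields[:3] != PREFIX:
            kept.append(line.rstrip("\n"))
            continue
        pgid, ids = fields[3], fields[4:]
        pairs = [(ids[i], ids[i + 1]) for i in range(0, len(ids) - 1, 2)]
        good = [
            (a, b) for a, b in pairs
            if int(a) not in excluded and int(b) not in excluded
        ]
        if good:
            flat = " ".join(f"{a} {b}" for a, b in good)
            kept.append(f"ceph osd pg-upmap-items {pgid} {flat}")
    return kept


sample = [
    "ceph osd pg-upmap-items 41.b 117 99 65 2",
    "ceph osd pg-upmap-items 41.12 81 3",
]
for cmd in filter_plan(sample, EXCLUDED):
    print(cmd)
```

Filtering pairs this way means the remaining plan no longer matches what the optimizer computed, so the resulting distribution may be less even than the original plan intended.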

Any ideas on how to proceed?

    Please paste your `ceph osd df tree` output (on pastebin). What is your
    Ceph version?
    Also, you can enable balancer debug messages by setting
    `debug mgr = 4/5` in your ceph.conf.

    k
