Rich,

What's your failure domain (osd? host? chassis? rack?) and how big is each of them? For example, I have a failure domain of type rack in one of my clusters with mostly even rack sizes:

# ceph osd crush rule dump | jq -r '.[].steps'
[
  {
    "op": "take",
    "item": -1,
    "item_name": "default~hdd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]
[
  {
    "op": "take",
    "item": -35,
    "item_name": "default~ssd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]

# ceph osd tree | grep rack
-74   160.34999   rack rack1
-73   160.68999   rack rack2
-72   160.68999   rack rack3

If I have a single failure, my rack sizes become uneven and the balancer can't achieve a perfect balance. I even find it'll sometimes get into a loop, which causes the mgrs to basically stop working. Ideally I would like to see each rack balanced within itself instead of across the cluster.

Bryan

> On Dec 10, 2019, at 5:30 PM, Rich Bade <richard.bade@xxxxxxxxx> wrote:
>
> I'm finding the same thing. The balancer used to work flawlessly, giving me a very even distribution with about 1% variance. Some time between 12.2.7 (maybe) and 12.2.12 it stopped working.
> Here's a small selection of my OSDs showing a 47%-62% spread:
>
> 210  hdd  7.27739  1.00000  7.28TiB  3.43TiB  3.84TiB  47.18  0.74  104
> 211  hdd  7.27739  1.00000  7.28TiB  3.96TiB  3.32TiB  54.39  0.85  118
> 212  hdd  7.27739  1.00000  7.28TiB  4.50TiB  2.77TiB  61.88  0.97  136
> 213  hdd  7.27739  1.00000  7.28TiB  4.06TiB  3.21TiB  55.85  0.87  124
> 214  hdd  7.27739  1.00000  7.28TiB  4.30TiB  2.98TiB  59.05  0.92  130
> 215  hdd  7.27739  1.00000  7.28TiB  4.41TiB  2.87TiB  60.54  0.95  134
>           TOTAL             1.26PiB  825TiB   463TiB   64.01
> MIN/MAX VAR: 0.74/1.10  STDDEV: 3.22
>
> $ sudo ceph balancer status
> {
>     "active": true,
>     "plans": [],
>     "mode": "upmap"
> }
>
> I'm happy to add data or test things to get this bug fixed.
>
> Rich

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
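
As a rough sketch of the kind of output that helps narrow this down (assuming the upmap balancer shown above; the mgr/balancer/upmap_max_deviation option in the last command exists on newer releases and is an assumption for the versions discussed here), per-failure-domain utilization and the balancer's own evaluation can be gathered with:

# Utilization rolled up per CRUSH bucket (rack/host), plus per-OSD detail
$ sudo ceph osd df tree

# The balancer's current mode and its score for the present distribution
$ sudo ceph balancer status
$ sudo ceph balancer eval

# Any pg_upmap_items the balancer has already injected into the osdmap
$ sudo ceph osd dump | grep upmap

# Newer releases expose the balancer's target deviation as an mgr option;
# older releases used a config-key with a different unit, so treat this
# line as an assumption rather than a drop-in command
$ sudo ceph config set mgr mgr/balancer/upmap_max_deviation 1

If the pool size matches the number of racks, each rack holds one copy of every PG, so a rack that comes up short on capacity will always run fuller overall; the spread worth watching in ceph osd df tree is between OSDs inside the same rack rather than across racks.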