Rich,

What's your failure domain (osd? host? chassis? rack?) and how big is each of them? For example, I have a failure domain of type rack in one of my clusters with mostly even rack sizes:

# ceph osd crush rule dump | jq -r '.[].steps'
[
  {
    "op": "take",
    "item": -1,
    "item_name": "default~hdd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]
[
  {
    "op": "take",
    "item": -35,
    "item_name": "default~ssd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]

# ceph osd tree | grep rack
-74   160.34999   rack rack1
-73   160.68999   rack rack2
-72   160.68999   rack rack3

If I have a single failure, my rack sizes become uneven and the balancer can't achieve a perfect balance. I even find it'll sometimes get into a loop, which causes the mgrs to basically stop working. Ideally I would like to see each rack balanced within itself instead of across the cluster.

Bryan

> On Dec 10, 2019, at 5:30 PM, Rich Bade <richard.bade@xxxxxxxxx> wrote:
>
> I'm finding the same thing. The balancer used to work flawlessly, giving me a very even distribution with about 1% variance. Some time between 12.2.7 (maybe) and 12.2.12 it stopped working.
> Here's a small selection of my OSDs showing a 47%-62% spread:
>
> 210  hdd  7.27739  1.00000  7.28TiB  3.43TiB  3.84TiB  47.18  0.74  104
> 211  hdd  7.27739  1.00000  7.28TiB  3.96TiB  3.32TiB  54.39  0.85  118
> 212  hdd  7.27739  1.00000  7.28TiB  4.50TiB  2.77TiB  61.88  0.97  136
> 213  hdd  7.27739  1.00000  7.28TiB  4.06TiB  3.21TiB  55.85  0.87  124
> 214  hdd  7.27739  1.00000  7.28TiB  4.30TiB  2.98TiB  59.05  0.92  130
> 215  hdd  7.27739  1.00000  7.28TiB  4.41TiB  2.87TiB  60.54  0.95  134
>           TOTAL             1.26PiB  825TiB   463TiB   64.01
> MIN/MAX VAR: 0.74/1.10  STDDEV: 3.22
>
> $ sudo ceph balancer status
> {
>     "active": true,
>     "plans": [],
>     "mode": "upmap"
> }
>
> I'm happy to add data or test things to get this bug fixed.
>
> Rich

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
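
As a rough sketch of the kind of output that helps narrow this down (assuming the upmap balancer shown above; the mgr/balancer/upmap_max_deviation option in the last command exists on newer releases and is an assumption for the versions discussed here), per-failure-domain utilization and the balancer's own evaluation can be gathered with:

# Utilization rolled up per CRUSH bucket (rack/host), plus per-OSD detail
$ sudo ceph osd df tree

# The balancer's current mode and its score for the present distribution
$ sudo ceph balancer status
$ sudo ceph balancer eval

# Any pg_upmap_items the balancer has already injected into the osdmap
$ sudo ceph osd dump | grep upmap

# Newer releases expose the balancer's target deviation as an mgr option;
# older releases used a config-key with a different unit, so treat this
# line as an assumption rather than a drop-in command
$ sudo ceph config set mgr mgr/balancer/upmap_max_deviation 1

If the pool size matches the number of racks, each rack holds one copy of every PG, so a rack that comes up short on capacity will always run fuller overall; the spread worth watching in ceph osd df tree is between OSDs inside the same rack rather than across racks.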