Bryan,
Try setting the config option osd_calc_pg_upmaps_aggressively=false and
see if that helps with the mgr getting wedged.
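For what it's worth, on releases with the centralized config store that would look roughly like this (just a sketch; the option's scope and availability can vary by release, so double-check against your version):

# ceph config set mgr osd_calc_pg_upmaps_aggressively false
# ceph config get mgr osd_calc_pg_upmaps_aggressively

The balancer's upmap calculation runs in the mgr, hence setting it there.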
David
On 12/10/19 4:41 PM, Bryan Stillwell wrote:
Rich,
What's your failure domain (osd? host? chassis? rack?) and how big is each one?
For example, I have a failure domain of type rack in one of my clusters with mostly even rack sizes:
# ceph osd crush rule dump | jq -r '.[].steps'
[
  {
    "op": "take",
    "item": -1,
    "item_name": "default~hdd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]
[
  {
    "op": "take",
    "item": -35,
    "item_name": "default~ssd"
  },
  {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
  },
  {
    "op": "emit"
  }
]
# ceph osd tree |grep rack
-74 160.34999 rack rack1
-73 160.68999 rack rack2
-72 160.68999 rack rack3
If I have a single failure my rack sizes become uneven and the balancer can't achieve a perfect balance. I even find it'll sometimes get into a loop which causes the mgrs to basically stop working. Ideally I would like to see each rack balanced within itself instead of across the whole cluster.
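(If it's useful, per-rack utilization is easy to eyeball with the tree view of osd df, which rolls usage up through the crush hierarchy:

# ceph osd df tree

That's only a way to check the per-rack numbers, not a fix for the balancer looping.)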
Bryan
On Dec 10, 2019, at 5:30 PM, Rich Bade <richard.bade@xxxxxxxxx> wrote:
I'm finding the same thing. The balancer used to work flawlessly, giving me a very even distribution with about 1% variance. Somewhere between 12.2.7 (maybe) and 12.2.12 it stopped working.
Here's a small selection of my OSDs showing a 47%-62% spread (columns are ID, CLASS, WEIGHT, REWEIGHT, SIZE, USE, AVAIL, %USE, VAR, PGS):
210 hdd 7.27739 1.00000 7.28TiB 3.43TiB 3.84TiB 47.18 0.74 104
211 hdd 7.27739 1.00000 7.28TiB 3.96TiB 3.32TiB 54.39 0.85 118
212 hdd 7.27739 1.00000 7.28TiB 4.50TiB 2.77TiB 61.88 0.97 136
213 hdd 7.27739 1.00000 7.28TiB 4.06TiB 3.21TiB 55.85 0.87 124
214 hdd 7.27739 1.00000 7.28TiB 4.30TiB 2.98TiB 59.05 0.92 130
215 hdd 7.27739 1.00000 7.28TiB 4.41TiB 2.87TiB 60.54 0.95 134
TOTAL 1.26PiB 825TiB 463TiB 64.01
MIN/MAX VAR: 0.74/1.10 STDDEV: 3.22
$ sudo ceph balancer status
{
    "active": true,
    "plans": [],
    "mode": "upmap"
}
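In case it helps with debugging, this is roughly what I can run to show the balancer's own view of things (the plan name below is just a placeholder):

$ sudo ceph balancer eval
$ sudo ceph balancer optimize test-plan
$ sudo ceph balancer show test-plan

If optimize reports it can't find any further optimization even with the spread above, that might help narrow down where the scoring changed.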
I'm happy to add data or test things to get this bug fixed.
Rich
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx