One possibly relevant detail: the cluster has 8 nodes, and the new pool
I created uses k=5, m=2 erasure coding.
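
For reference, a k=5 m=2 pool like this would be created roughly along these
lines (just a sketch; the profile name, the filesystem name and the
crush-device-class below are placeholders, not necessarily exactly what I ran):

# ceph osd erasure-code-profile set ec-k5m2-hdd k=5 m=2 crush-device-class=hdd
# ceph osd pool create fs-data-k5m2-hdd 2048 2048 erasure ec-k5m2-hdd
# ceph osd pool set fs-data-k5m2-hdd allow_ec_overwrites true
# ceph fs add_data_pool <fsname> fs-data-k5m2-hdd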
Vlad
On 4/9/20 11:28 AM, Vladimir Brik wrote:
Hello
I am running ceph 14.2.7 with balancer in crush-compat mode (needed
because of old clients), but it doesn't seem to be doing anything. It
used to work in the past. I am not sure what changed. I created a big
pool, ~285TB stored, and it doesn't look like it ever got balanced:
pool 43 'fs-data-k5m2-hdd' erasure size 7 min_size 6 crush_rule 7
object_hash rjenkins pg_num 2048 pgp_num 2048 autoscale_mode warn
last_change 48647 lfor 0/42080/42102 flags
hashpspool,ec_overwrites,nearfull stripe_width 20480 application cephfs
OSD utilization varies between ~50% and ~80%, with about 60% raw used. I am
using a mixture of 9TB and 14TB drives. The number of PGs per drive varies
between 103 and 207.
# ceph osd df | grep hdd | sort -k 17 | (head -n 2; tail -n 2)
160 hdd 12.53519 1.00000  13 TiB 6.0 TiB 5.9 TiB  74 KiB 12 GiB 6.6 TiB 47.74 0.79 120 up
146 hdd 12.53519 1.00000  13 TiB 6.0 TiB 6.0 TiB  51 MiB 13 GiB 6.5 TiB 48.17 0.80 119 up
 79 hdd  8.99799 1.00000 9.0 TiB 7.3 TiB 7.2 TiB  42 KiB 16 GiB 1.7 TiB 80.91 1.34 186 up
 62 hdd  8.99799 1.00000 9.0 TiB 7.3 TiB 7.2 TiB 112 KiB 16 GiB 1.7 TiB 81.44 1.35 189 up
# ceph balancer status
{
    "last_optimize_duration": "0:00:00.339635",
    "plans": [],
    "mode": "crush-compat",
    "active": true,
    "optimize_result": "Some osds belong to multiple subtrees: {0: ['default', 'default~hdd'], ...
    "last_optimize_started": "Thu Apr 9 11:17:40 2020"
}
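If it is relevant: I assume the "multiple subtrees" message refers to the
per-device-class shadow trees (default vs default~hdd). These are the kinds of
commands I would use to inspect the shadow hierarchy, the compat weight-set,
and the per-pool distribution score (output omitted here):

# ceph osd crush tree --show-shadow
# ceph osd crush weight-set ls
# ceph osd crush weight-set dump
# ceph balancer eval fs-data-k5m2-hdd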
Does anybody know how to debug this?
Thanks,
Vlad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx