Hi,
We have a Ceph cluster with rack as the failure domain, but the racks are so imbalanced that we cannot make use of most of the allocated storage: some OSDs in the small racks fill up too fast, pushing the cluster into a warning state and triggering the near_full_ratio.
We are planning to restructure the entire CRUSH map with rows as the failure domain instead of racks, so that each row has the same number of hosts regardless of how many racks are in each row. We are using 3x replication in our cluster. (A rough command sketch for the new hierarchy follows the layout below.)
Current:
Rack 1 has 4 hosts
Rack 2 has 2 hosts
Rack 3 has 3 hosts
Rack 4 has 6 hosts
Rack 5 has 7 hosts
Rack 6 has 2 hosts
Rack 7 has 3 hosts
Future: with each row having 9 hosts:
Row_A with Rack 1 + Rack 2 + Rack 3 = 9 Hosts
Row_B with Rack 4 + Rack 7 = 9 Hosts
Row_C with Rack 5 + Rack 6 = 9 Hosts
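To make that target layout concrete, I would expect to create the new hierarchy roughly like this (the rule name "replicated_row" and the root name "default" are just my assumptions; the row names are as above):

    # create the new (empty) row buckets
    ceph osd crush add-bucket Row_A row
    ceph osd crush add-bucket Row_B row
    ceph osd crush add-bucket Row_C row

    # attach the rows to the CRUSH root (assuming the root is named "default")
    ceph osd crush move Row_A root=default
    ceph osd crush move Row_B root=default
    ceph osd crush move Row_C root=default

    # replicated rule with row as the failure domain
    ceph osd crush rule create-replicated replicated_row default row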
The question is: how can we do this safely without triggering too much rebalancing?
I can add empty rows to the CRUSH map and change the failure domain to row without any rebalancing, but as soon as I move a rack under a row it triggers 50-60% data movement, and the cluster even becomes unreachable (error: connecting to cluster). How can we avoid this?
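For clarity, the step that kicks off the massive data movement is re-parenting an existing rack under one of the new rows, for example (rack and row names as above):

    # move an existing rack bucket under the new row bucket
    ceph osd crush move Rack1 row=Row_A

As soon as the rack's CRUSH location changes, the PG mappings are recalculated, which appears to be where the 50-60% misplaced data comes from.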
Thanks,
Pardhiv Karri
"Rise and Rise again until LAMBS become LIONS"
"Rise and Rise again until LAMBS become LIONS"