On 3/8/19 4:17 AM, Pardhiv Karri wrote:
> Hi,
>
> We have a Ceph cluster with rack as the failure domain, but the racks
> are so imbalanced that we cannot use the full allocated storage: the
> OSDs in the small racks fill up too fast, putting the cluster into a
> warning state and triggering the near_full_ratio.
>
> We are planning to restructure the entire CRUSH map with row as the
> failure domain instead of rack, so that each row has the same number
> of hosts regardless of how many racks are in each row. We use 3x
> replication in this cluster.
>
> Current:
> Rack 1 has 4 hosts
> Rack 2 has 2 hosts
> Rack 3 has 3 hosts
> Rack 4 has 6 hosts
> Rack 5 has 7 hosts
> Rack 6 has 2 hosts
> Rack 7 has 3 hosts
>
> Future, with each row having 9 hosts:
>
> Row_A with Rack 1 + Rack 2 + Rack 3 = 9 hosts
> Row_B with Rack 4 + Rack 7 = 9 hosts
> Row_C with Rack 5 + Rack 6 = 9 hosts
>
> The question is: how can we do this safely without triggering too
> much rebalancing?
>
> I can add empty rows to the CRUSH map and change the failure domain
> to row without any rebalancing, but as soon as I move a rack under a
> row it triggers 50-60% data movement, and the cluster even becomes
> completely unreachable (error: connecting to cluster). How can we
> avoid that?

The cluster going down is not correct; even with such a big change you
should not see the cluster go down.

It is normal that such an operation triggers a large data migration:
you are changing the topology of the cluster, so it moves data.

Wido

> Thanks,
> Pardhiv Karri
> "Rise and Rise again until LAMBS become LIONS"

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
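
A minimal sketch of how a rack-to-row move like the one discussed above
is often staged so the data migration stays controlled. The bucket, rule,
and pool names (row_a, rack1, replicated_row, mypool) are placeholders,
not the names in Pardhiv's cluster, and the rule-creation syntax assumes
Luminous or later:

  # Pause data movement while the CRUSH tree is rearranged.
  ceph osd set norebalance
  ceph osd set nobackfill

  # Create a row bucket and place it under the root (names are examples).
  ceph osd crush add-bucket row_a row
  ceph osd crush move row_a root=default

  # Move each rack under its row, one rack at a time; PGs get remapped,
  # but no data moves while the flags are set.
  ceph osd crush move rack1 row=row_a
  ceph osd crush move rack2 row=row_a
  ceph osd crush move rack3 row=row_a
  # (repeat for row_b / row_c and the remaining racks)

  # Point the pool(s) at a rule that uses row as the failure domain.
  ceph osd crush rule create-replicated replicated_row default row
  ceph osd pool set mypool crush_rule replicated_row

  # Throttle recovery so client I/O stays responsive, then let it run.
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph osd unset nobackfill
  ceph osd unset norebalance

The large migration itself cannot be avoided, as Wido notes, but doing
the bucket moves with the flags set and recovery throttled keeps the
cluster serving clients while it rebalances. The edited CRUSH map can
also be checked offline with crushtool --test before any racks are
moved.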