I'm managing a Ceph cluster with 1K+ OSDs distributed across 56 hosts. Until now the CRUSH rule in use has been the default replicated rule, but I want to change that in order to implement a rack-level failure domain.

Some facts about the cluster:
* Ceph version: Pacific 16.2.15
* All pools (RBD and CephFS) currently use the default replicated_rule
* All OSD hosts have 25G networking and spinning disks (HDD); MON DBs are on NVMe
* Workload is 24x7x365
* The built-in balancer is disabled, and has been for a long time. Instead, balancing has been done by a cron job executing:
  ceph osd reweight-by-utilization 112 0.05 30

The current plan is to:
* Disable rebalancing and backfilling:
  ceph osd set norebalance; ceph osd set nobackfill
* Add all 7 racks to the crushmap and distribute the hosts (8 in each) using the built-in commands, e.g.:
  ceph osd crush add-bucket rack<#> rack root=default
  ceph osd crush move osd-host<#> rack=rack<#>
* Create the new rack split rule:
  ceph osd crush rule create-replicated rack_split default rack
* Set the rule on all my pools:
  for p in $(ceph osd lspools | cut -d' ' -f 2) ; do echo $p $(ceph osd pool set $p crush_rule rack_split) ; done
* I will probably also be using upmap-remapped.py here.
* Finally, re-enable rebalancing and backfilling:
  ceph osd unset norebalance; ceph osd unset nobackfill

(A consolidated, scripted version of these steps is included at the end of this mail.)

However, I'm concerned about the amount of data that will need to be rebalanced, since the cluster holds multiple PB. I'm looking for a review of / input on my plan, as well as words of advice/experience from anyone who has been in a similar situation.

I've also seen some odd behavior where Pacific (16) seems to do something different from Quincy (17). Testing the plan on a test cluster:
* On Pacific: data is marked as "degraded", not misplaced as I would have expected. I also see the degraded percentage go above 2000% (but that might be another issue).
* On Quincy: data is marked as misplaced, which seems correct.

All experience and/or input will be greatly appreciated.
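
P.S. For reference, here is the whole sequence as I intend to script it. This is only a sketch: the rack1..rack7 / osd-host1..osd-host56 names and the 8-hosts-per-rack assignment are placeholders for my actual naming and layout.

#!/bin/bash
# Sketch of the planned CRUSH change. Assumes (placeholder) host names
# osd-host1..osd-host56 going into rack1..rack7, 8 hosts per rack.
set -euo pipefail

# 1. Stop data movement while the topology is being changed
ceph osd set norebalance
ceph osd set nobackfill

# 2. Create the rack buckets under the default root and move the hosts into them
for r in $(seq 1 7); do
    ceph osd crush add-bucket "rack${r}" rack root=default
done
for h in $(seq 1 56); do
    r=$(( (h - 1) / 8 + 1 ))   # hosts 1-8 -> rack1, 9-16 -> rack2, ...
    ceph osd crush move "osd-host${h}" "rack=rack${r}"
done

# 3. New replicated rule with rack as the failure domain
ceph osd crush rule create-replicated rack_split default rack

# 4. Point every pool at the new rule
for p in $(ceph osd lspools | cut -d' ' -f 2); do
    echo "$p $(ceph osd pool set "$p" crush_rule rack_split)"
done

# 5. Pin the resulting remapped PGs back with upmap entries so the
#    backfill can be released gradually (upmap-remapped.py, run separately):
# ./upmap-remapped.py | sh

# 6. Re-enable data movement
ceph osd unset nobackfill
ceph osd unset norebalance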
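
To get a rough feel for how many PGs would actually move before touching the live cluster, I'm also considering an offline dry run against a copy of the osdmap, along these lines (untested; the hand-edited crushmap, the /tmp paths and the <poolid> are placeholders):

# Grab the current osdmap and extract its crushmap
ceph osd getmap -o /tmp/osdmap
osdmaptool /tmp/osdmap --export-crush /tmp/crushmap
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt

# Hand-edit /tmp/crushmap.txt (add rack buckets, move hosts, add rack_split rule),
# then recompile and inject it into a copy of the osdmap
crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
cp /tmp/osdmap /tmp/osdmap.new
osdmaptool /tmp/osdmap.new --import-crush /tmp/crushmap.new

# Dump PG-to-OSD mappings per pool before and after, and count changed lines
osdmaptool /tmp/osdmap --test-map-pgs-dump --pool <poolid> > /tmp/pgs.before
osdmaptool /tmp/osdmap.new --test-map-pgs-dump --pool <poolid> > /tmp/pgs.after
diff /tmp/pgs.before /tmp/pgs.after | grep -c '^>'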