> You can use pg-remapper (https://github.com/digitalocean/pgremapper) or
> similar tools to cancel the remapping; up-map entries will be created
> that reflect the current state of the cluster. After all currently
> running backfills are finished your mons should not be blocked anymore.
> I would also disable the balancer temporarily since it will trigger new
> backfills for those PG that are not at their optimal locations. After
> mons are fine again you can just enable the balancer. This requires a
> ceph release and ceph clients with up-map support.
>
> Not tested in real life, but this approach might work.

We use that approach at times, just so that there isn't a long queue
of PGs in the remapped state, and as far as I can tell it is quite safe.
You programmatically tell each PG that there is an upmap entry for it,
telling it to be exactly where it is now, and then it isn't "misplaced"
anymore.

When you enable the balancer, it will take a percentage of these PGs and
just remove their individual upmap entries, and they start to move as
needed. If you want only a little movement at a time, set the balancer's
maximum misplaced ratio (target_max_misplaced_ratio) to a really low
value, and few PGs will be moving at the same time. If your wpq/mclock
settings work OK for you, you can use a large percentage and let the IO
scheduler prioritize for you.

But as Burkhard says, setting "norebalance" for a moment, keeping the
balancer disabled, and then running one of these tools once or twice
will make all PGs active+clean where they are, even if that isn't the
desired end location for them. This should help your mons a lot. Then
enable the balancer, unset "norebalance", and let it finish the last
PGs you have in the wrong spot.

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
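
As a rough illustration of the "tell each PG to be exactly where it is
now" step described above, here is a minimal Python sketch. It only
builds the `ceph osd pg-upmap-items` command lines and does not run
them. The function names (`upmap_pairs`, `upmap_command`) and the sample
up/acting sets are made up for this example. It is a simplification: it
assumes no upmap entries exist yet for the PG and ignores shard ordering
in EC pools, both of which tools like pgremapper take care of.

```python
from typing import Dict, List, Tuple


def upmap_pairs(up: List[int], acting: List[int]) -> List[Tuple[int, int]]:
    """Pair each OSD that CRUSH wants (in 'up' only) with an OSD that
    currently holds the data (in 'acting' only).

    Simplified: assumes a replicated pool and no pre-existing upmap
    entries for this PG.
    """
    up_set, acting_set = set(up), set(acting)
    sources = [osd for osd in up if osd not in acting_set]
    targets = [osd for osd in acting if osd not in up_set]
    return list(zip(sources, targets))


def upmap_command(pgid: str, up: List[int], acting: List[int]) -> List[str]:
    """Build the 'ceph osd pg-upmap-items' argv that pins the PG to its
    current location. Returns an empty list if nothing needs remapping."""
    pairs = upmap_pairs(up, acting)
    if not pairs:
        return []
    cmd = ["ceph", "osd", "pg-upmap-items", pgid]
    for source, target in pairs:
        cmd += [str(source), str(target)]
    return cmd


if __name__ == "__main__":
    # Hypothetical sample data: pgid -> (up set, acting set).
    sample: Dict[str, Tuple[List[int], List[int]]] = {
        "1.a": ([1, 2, 3], [1, 5, 3]),  # CRUSH wants osd.2, data is on osd.5
        "1.b": ([4, 6, 7], [4, 6, 7]),  # already in place, nothing to do
    }
    for pgid, (up, acting) in sample.items():
        cmd = upmap_command(pgid, up, acting)
        if cmd:
            print(" ".join(cmd))
```

In a real cluster you would set "norebalance" and turn the balancer off
first, feed in the up/acting sets of the remapped PGs, and review the
generated commands before running any of them.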