Hello,

I've recently decided to gradually decommission my Nautilus cluster and migrate the hardware to a new Pacific or Quincy cluster. By gradually, I mean that as I expand the new cluster I will move (copy/erase) content from the old cluster to the new one, making room to decommission more nodes and move them over. To do this I will, of course, need to remove OSD nodes by first emptying the OSDs on each node.

I noticed that pgremapper (a version prior to October 2021) has a 'drain' subcommand that lets one control which target OSDs receive the PGs from the source OSD being drained. This seemed like a good idea: if one simply marks an OSD 'out', its contents would be rebalanced to other OSDs on the same node that are still active, which would cause a lot of unnecessary data movement and also make removing the next OSD take longer.

So I went to the trouble of creating a 'really long' pgremapper drain command excluding the OSDs of two nodes as targets:

# bin/pgremapper drain 16 --target-osds 00,01,02,03,04,05,06,07,24,25,16,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71 --allow-movement-across host --max-source-backfills 75 --concurrency 20 --verbose --yes

However, when this completed, OSD 16 actually contained more PGs than before I started. It appears that the mapping generated by pgremapper also backfilled the OSD as it was draining it.

So did I miss something here? What is the best way to proceed? I understand that it would be mayhem to mark 8 of 72 OSDs out and then turn backfill/rebalance/recovery back on, but it seems like there should be a better way. Suggestions?

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
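P.S. - A small aside on the long --target-osds list: it could be built with seq instead of typed by hand, which avoids slips like the repeated 41 (and the stray 16, presumably meant to be 26). This is just a sketch, assuming the intent was to allow OSDs 0-7 and 24-71 as targets (i.e. exclude the two nodes holding OSDs 8-23) and that pgremapper accepts unpadded integer OSD IDs:

```shell
# Build the comma-separated target list programmatically.
# Assumption: targets are OSDs 0-7 and 24-71 (OSDs 8-23 excluded).
TARGETS=$(printf '%s,' $(seq 0 7) $(seq 24 71))
TARGETS=${TARGETS%,}   # drop the trailing comma left by printf

echo "$TARGETS"
# then something like:
#   bin/pgremapper drain 16 --target-osds "$TARGETS" --allow-movement-across host ...
```

Generating the list also makes it trivial to double-check: 8 + 48 = 56 target OSDs, and the source OSD being drained can be grepped out before running anything.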