There is also another option that I like to use:

- create a disjoint new crush root
- set nodown, noout, norebalance, nobackfill
- possibly insert PG merging here; see below
- osd crush move the hosts to the new root
- wait for peering to finish
- unset nodown, noout, norebalance, nobackfill

This will avoid *any* duplicate data movement. Setting the crush weight to 0 and then removing the OSDs after draining *will* lead to a second data movement after removing the OSDs. The crush placements will change again (at least that is what happened on my cluster when using the weight=0 drain+remove procedure), because these OSDs are still in the crush root and influence the hashing algorithm.

In principle you should be able to do *everything* in one go by doing the PG merging in the step I indicated above. Then you only have a couple of peering storms and exactly one data movement into the final locations.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Matt Vandermeulen <storage@xxxxxxxxxxxx>
Sent: 21 February 2022 23:31:49
To: Jason Borden
Cc: ceph-users@xxxxxxx
Subject: Re: Reducing ceph cluster size in half

This might be easiest to work through in two steps: draining hosts, and doing a PG merge. You can do them in either order (though, thinking about it, doing the merge first will give you more cluster-wide resources to do it faster).

Draining the hosts can be done in a few ways, too. If you want to do it in one shot, you can set nobackfill, then set the crush/reweights for the OSDs to zero, let the peering storm settle, and unset nobackfill. This is probably the easiest option if a brief peering storm and backfill_wait isn't a concern.

If you want to reduce backfill_wait PGs, you can use something like `pgremapper drain`, but this will likely involve multiple data movements: the initial drain is fine, but the CRUSH removal of hosts will cause the upmaps to be lost (which can be cancelled away with `pgremapper cancel-backfill`). Additional data movement will be needed if you want to `pgremapper undo-upmaps` to clean up what was cancelled (or if you use the balancer and it wants to move things).

On 2022-02-21 17:58, Jason Borden wrote:
> Hi all,
>
> I'm looking for some advice on reducing my ceph cluster by half. I
> currently have 40 hosts and 160 OSDs on a cephadm-managed Pacific
> cluster. The storage space is only 12% utilized. I want to reduce the
> cluster to 20 hosts and 80 OSDs while keeping the cluster operational.
> I'd prefer to do this in as few operations as possible instead of
> draining one host at a time and having to rebalance PGs 20 times. I
> think I should probably halve the number of PGs at the same time too.
> Does anyone have any advice on how I can safely achieve this?
>
> Thanks,
> Jason
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
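
For illustration, a minimal sketch of the disjoint-root procedure described at the top of the thread, using only standard ceph CLI commands. The root name (root-new), the host names (host01 ... host20), the pool name and the target pg_num are all placeholders; note also that the pools' CRUSH rules must select from the new root for PGs to map there, and adjusting the rules is not shown.

  # create a disjoint new crush root (name is a placeholder)
  ceph osd crush add-bucket root-new root

  # stop data from shuffling while CRUSH is being reorganised
  ceph osd set nodown
  ceph osd set noout
  ceph osd set norebalance
  ceph osd set nobackfill

  # optional: merge PGs now, e.g. halve pg_num on a pool
  # (pool name and number are placeholders)
  ceph osd pool set mypool pg_num 512

  # move the hosts that will remain under the new root (repeat per host)
  ceph osd crush move host01 root=root-new
  ceph osd crush move host20 root=root-new

  # wait for peering to finish, then re-enable recovery
  ceph osd unset nobackfill
  ceph osd unset norebalance
  ceph osd unset noout
  ceph osd unset nodown

With the flags set, the crush moves and the pg_num change only re-peer the PGs; the actual data movement starts once the flags are unset, which is what keeps it to a single movement into the final locations.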
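
Similarly, a rough sketch of the one-shot drain Matt describes (OSD ids are placeholders; assumes a brief peering storm and a long backfill_wait queue are acceptable). `ceph osd crush reweight` is shown here; `ceph osd reweight` is the other knob covered by "crush/reweights".

  # hold backfill while the weights change
  ceph osd set nobackfill

  # zero the CRUSH weight of every OSD on the hosts being removed (repeat per OSD)
  ceph osd crush reweight osd.80 0
  ceph osd crush reweight osd.159 0

  # once the peering storm settles, let the data drain off in one pass
  ceph osd unset nobackfill

As Frank notes above, the drained OSDs are still in the CRUSH tree at this point, so removing them afterwards can shift placements a second time; the disjoint-root variant avoids that.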