Hi Sinan,

The safest approach would be to use the upmap-remapped.py tool developed by Dan at CERN, see [1] for details. The idea is to leverage the balancer in upmap mode to progressively migrate the data to the new servers, minimizing the performance impact on the cluster and its clients.

I like to create the OSDs ahead of time on the new nodes, which I initially place under a temporary CRUSH root called ‘closet’. I then:

1. set the norebalance flag (ceph osd set norebalance),
2. disable the balancer (ceph balancer off),
3. move the new nodes, with their OSDs already provisioned, to their final location (rack),
4. run ./upmap-remapped.py to bring all PGs back to the active+clean state,
5. unset the norebalance flag (ceph osd unset norebalance),
6. re-enable the balancer (ceph balancer on),

and then watch the data move progressively as the upmap balancer executes its plans. A rough command sketch follows below your quoted message.

Regards,
Frédéric

[1] https://docs.clyso.com/blog/adding-capacity-with-upmap-remapped/

----- On 17 Mar 25, at 17:51, Sinan Polat sinan86polat@xxxxxxxxx wrote:

> Hello,
>
> I am currently managing a Ceph cluster that consists of 3 racks, each with
> 4 OSD nodes. Each node contains 24 OSDs. I plan to add three new nodes, one
> to each rack, to help alleviate the high OSD utilization.
>
> The current highest OSD utilization is 85%. I am concerned about the
> possibility of any OSD reaching the osd_full_ratio threshold during the
> rebalancing process. This would cause the cluster to enter a read-only
> state, which I want to avoid at all costs.
>
> I am planning to execute the following commands:
>
> ceph orch host add new-node-1
> ceph orch host add new-node-2
> ceph orch host add new-node-3
>
> ceph osd crush move new-node-1 rack=rack-1
> ceph osd crush move new-node-2 rack=rack-2
> ceph osd crush move new-node-3 rack=rack-3
>
> ceph config set osd osd_max_backfills 1
> ceph config set osd osd_recovery_max_active 1
> ceph config set osd osd_recovery_sleep 0.1
>
> ceph orch apply osd --all-available-devices
>
> Before proceeding, I would like to ask if the above steps are safe to
> execute in a cluster with such high utilization. My main concern is whether
> the rebalancing could cause any OSD to exceed the osd_full_ratio and result
> in unexpected failures.
>
> Any insights or advice on how to safely add these nodes without impacting
> cluster stability would be greatly appreciated.
>
> Thanks!
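
PS: if it helps, here is a rough, untested sketch of that sequence using the host and rack names from your message. The ‘closet’ bucket name, the pre-creation of the host buckets so the new OSDs come up under it, and piping the upmap-remapped.py output straight to a shell are assumptions on my side; adapt and verify each step in your own environment.

# Pre-checks: per-OSD utilization and the full/backfillfull/nearfull ratios
ceph osd df tree
ceph osd dump | grep -i ratio

# Temporary CRUSH root that no rule references, so OSDs created under it
# receive no PGs
ceph osd crush add-bucket closet root

# Pre-create the new host buckets under 'closet'; the bucket names must match
# the short hostnames. Verify with 'ceph osd tree' that the new OSDs really
# come up (and stay) under 'closet' before going any further.
ceph osd crush add-bucket new-node-1 host
ceph osd crush add-bucket new-node-2 host
ceph osd crush add-bucket new-node-3 host
ceph osd crush move new-node-1 root=closet
ceph osd crush move new-node-2 root=closet
ceph osd crush move new-node-3 root=closet

# Add the hosts and provision their OSDs
ceph orch host add new-node-1
ceph orch host add new-node-2
ceph orch host add new-node-3
ceph orch apply osd --all-available-devices

# Freeze data movement before changing the topology
ceph osd set norebalance
ceph balancer off

# Move the fully provisioned hosts to their final racks
ceph osd crush move new-node-1 rack=rack-1
ceph osd crush move new-node-2 rack=rack-2
ceph osd crush move new-node-3 rack=rack-3

# Map the remapped PGs back to the OSDs currently holding the data; review
# the commands the script prints before feeding them to a shell
./upmap-remapped.py | sh

# Once everything is active+clean again, let the balancer take over
ceph osd unset norebalance
ceph balancer on
ceph -s
ceph balancer status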
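
This is also what addresses your full-ratio concern: the new OSDs get no PGs while parked under ‘closet’, norebalance blocks backfill while you move them into the racks, and after upmap-remapped.py everything is active+clean again, so data should only move in the small increments the balancer plans. If any OSD still creeps toward the nearfull/backfillfull ratios, you can pause at any time with ‘ceph balancer off’ and resume later.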