Hi Niklas,

To explain the 33% of misplaced objects after you move a host to another DC, one would have to check the current crush rule (ceph osd getcrushmap | crushtool -d -) and which OSDs the PGs are mapped to before and after the move operation (ceph pg dump).

Regarding the replicated crush rule that Ceph creates by default, the rule spreads the replicas across hosts under the 'default' root (the failure domain being host):

# rules
rule replicated_rule {
    id 0
    type replicated
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

If you're using size 3 and min_size 2, and want to make sure your cluster continues to serve IOs with 2 datacenters down, then you need to make sure that these 2 datacenters host only 1 replica. You could group these 2 datacenters in one zone and all the other datacenters in another zone, then place 1 replica in the first zone and the other 2 replicas in any of the datacenters of the second zone. For example:

root default
    region FSN
        zone FSN1
            datacenter FSN1-DC1
                host machine-1
                    osd.0
                    ... 10 OSDs per datacenter
                ... currently 1 machine per datacenter
            datacenter FSN1-DC2
                host machine-2
            ...
        zone FSN2
            ... the other 6 datacenters

Then create a new rule as per below:

# rules
rule replicated_zone {
    id 1
    type replicated
    step take FSN1
    step chooseleaf firstn 1 type datacenter
    step emit
    step take FSN2
    step chooseleaf firstn 2 type datacenter
    step emit
}

Then you'd just have to change the pool(s)' crush rule and wait for the data movement:

ceph osd pool set rbd_zone crush_rule replicated_zone

Note that you can use crushtool [1] to simulate the PG mappings and check the new crush rule before applying it to any pools; I've sketched that workflow in the P.S. at the end of this message, below the quoted thread.

Regards,
Frédéric.

[1] https://docs.ceph.com/en/reef/man/8/crushtool/

----- On 4 Nov 24, at 17:01, Niklas Hambüchen mail@xxxxxx wrote:

> Hi Joachim,
>
> I'm currently looking for the general methodology and if it's possible without
> rebalancing everything.
>
> But of course I'd also appreciate tips directly for my deployment; here is the
> info:
>
> Ceph 18, simple 3-replication (osd_pool_default_size = 3, default CRUSH rules
> Ceph creates for that).
>
> Failure domains from `ceph osd tree`:
>
>     root default
>         region FSN
>             zone FSN1
>                 datacenter FSN1-DC1
>                     host machine-1
>                         osd.0
>                         ... 10 OSDs per datacenter
>                     ... currently 1 machine per datacenter
>                 datacenter FSN1-DC2
>                     host machine-2
>                 ...
>             ... currently 8 datacenters
>
> I already tried simply
>
>     ceph osd crush move machine-1 datacenter=FSN1-DC2
>
> to "simulate" that DC1 and DC2 are temporarily the same failure domain
> (machine-1 is the only machine in DC1 currently), but that immediately causes
> 33% of objects to be misplaced -- much more movement than I'd hope for, and more
> than would be needed (I'd expect 12.5% would need to be moved, given that 1 out
> of 8 DCs needs to be moved).
>
> Thanks!
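
P.S. A couple of rough command sketches, in case they help. To see exactly which PGs a 'ceph osd crush move' remaps (and hence where the 33% comes from), you can dump the PG-to-OSD mappings before and after the move and diff them. Something along these lines -- the file names are just placeholders:

ceph pg dump pgs_brief > pgs_before.txt     # up/acting OSD sets per PG
ceph osd crush move machine-1 datacenter=FSN1-DC2
ceph pg dump pgs_brief > pgs_after.txt
diff pgs_before.txt pgs_after.txt           # changed lines show the remapped PGs (plus some header noise)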
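
If you go down the two-zone route, the zone bucket can be created and the datacenters moved with the usual crush commands. A minimal sketch -- I'm assuming the remaining datacenters are named FSN1-DC3 and so on, so adjust to your actual bucket names, and keep in mind that reshuffling buckets will itself remap data, so it's best planned together with the rule change:

ceph osd crush add-bucket FSN2 zone        # create the second zone
ceph osd crush move FSN2 region=FSN        # attach it under the existing region
ceph osd crush move FSN1-DC3 zone=FSN2     # assumed name; repeat for every datacenter
ceph osd crush move FSN1-DC4 zone=FSN2     # that should carry 2 of the 3 replicas
...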
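
As for the new rule itself, it can be added and tested entirely offline before anything is injected into the cluster. Roughly -- the file names are placeholders and the rule id may end up different on your cluster:

ceph osd getcrushmap -o crushmap.bin          # grab the current compiled map
crushtool -d crushmap.bin -o crushmap.txt     # decompile it to text
# edit crushmap.txt and add the replicated_zone rule shown above
crushtool -c crushmap.txt -o crushmap.new     # recompile
# simulate the mappings for the new rule (id 1 here) with 3 replicas
crushtool -i crushmap.new --test --rule 1 --num-rep 3 --show-mappings
crushtool -i crushmap.new --test --rule 1 --num-rep 3 --show-bad-mappings
ceph osd setcrushmap -i crushmap.new          # inject once the mappings look right

Once the map is in, switching the pool(s) to the new rule with 'ceph osd pool set' as above will start the data movement.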