Use a crush rule like this for the replicated pool:

1) take the default root, restricted to device class XXX
2) choose 2 rooms
3) choose 2 disks

(A decompiled-crushmap sketch of such a rule follows at the end of this message.)

That'll get you 4 OSDs across the two rooms; with size 3, the first 3 of them get data and the fourth is ignored. That guarantees that losing a room will cost you at most 2 out of the 3 copies.

This is for disaster recovery only: it guarantees durability if you lose a room, but not availability, because losing a room leaves a single copy, which is below the default min_size of 2, so the affected PGs stop serving I/O until the room comes back.

3+2 erasure coding cannot be split across two rooms in this way: you need 3 out of the 5 shards to survive, but one of the two rooms necessarily holds at least 3 shards, so losing that room leaves you with at most 2.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Thu, Nov 28, 2019 at 5:40 PM Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
>
> Hi,
> I have a cephfs in production based on 2 pools (data + metadata).
>
> Data is in erasure coding with the profile:
>
> crush-failure-domain=host
> crush-root=default
> jerasure-per-chunk-alignment=false
> k=3
> m=2
> plugin=jerasure
> technique=reed_sol_van
> w=8
>
> Metadata is in replicated mode with k=3
>
> The crush rules are as follows:
>
> [
>     {
>         "rule_id": 0,
>         "rule_name": "replicated_rule",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>     {
>         "rule_id": 1,
>         "rule_name": "ec_data",
>         "ruleset": 1,
>         "type": 3,
>         "min_size": 3,
>         "max_size": 5,
>         "steps": [
>             {
>                 "op": "set_chooseleaf_tries",
>                 "num": 5
>             },
>             {
>                 "op": "set_choose_tries",
>                 "num": 100
>             },
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_indep",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
> ]
>
> When we installed it, everything was in the same room, but now we have
> split our cluster (6 servers, soon 8) across 2 rooms. We therefore
> updated the crushmap by adding a room layer (ceph osd crush add-bucket
> room1 room, etc.) and moved our servers to the correct place in the
> tree (ceph osd crush move server1 room=room1, etc.).
>
> Now we would like to change the rules to use a failure domain of room
> instead of host, to be sure that in case of disaster in one of the
> rooms we still have a copy in the other.
>
> What is the best strategy to do this?
>
> F.
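
For reference, a minimal sketch of the rule described at the top of this
message, in decompiled-crushmap form. The rule name, the id and the
"class hdd" restriction are only placeholders (hdd stands in for the
"XXX" above); adjust them to your cluster before compiling:

    rule replicated_two_rooms {
            id 2
            type replicated
            min_size 1
            max_size 4
            # restrict to one device class; "class hdd" is a placeholder,
            # drop the suffix entirely if you don't use device classes
            step take default class hdd
            # pick 2 rooms ...
            step choose firstn 2 type room
            # ... then 2 different hosts (one OSD each) in each room
            step chooseleaf firstn 2 type host
            step emit
    }

With pool size 3, CRUSH returns 4 OSDs (2 per room) and the last one is
simply not used, which is exactly the behaviour described above.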
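
As for how to apply it: the ceph osd crush rule create-replicated helper
only takes a single failure domain, so a two-step rule like the one above
has to be added by editing the crushmap by hand and then pointing the pool
at the new rule. A rough sketch, assuming the metadata pool is named
cephfs_metadata (adjust names and ids to your setup); note that switching
the rule will trigger backfill:

    # export and decompile the current crushmap
    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt

    # add the rule above to crush.txt, then recompile and inject it
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new

    # point the replicated pool at the new rule (pool name is an example)
    ceph osd pool set cephfs_metadata crush_rule replicated_two_rooms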