Hi folks,

I am currently testing erasure-code-lrc (1) in a multi-room, multi-rack setup. The idea is to be able to repair a disk failure within the rack itself to lower bandwidth usage:

```bash
ceph osd erasure-code-profile set lrc_hdd \
  plugin=lrc \
  crush-root=default \
  crush-locality=rack \
  crush-failure-domain=host \
  crush-device-class=hdd \
  mapping=__DDDDD__DDDDD__DDDDD__DDDDD \
  layers='
  [
    [ "_cDDDDD_cDDDDD_cDDDDD_cDDDDD", "" ],
    [ "cDDDDDD_____________________", "" ],
    [ "_______cDDDDDD______________", "" ],
    [ "______________cDDDDDD_______", "" ],
    [ "_____________________cDDDDDD", "" ],
  ]' \
  crush-steps='[
    [ "choose", "room", 4 ],
    [ "choose", "rack", 1 ],
    [ "chooseleaf", "host", 7 ],
  ]'
```

The rule picks 4 out of 5 rooms and keeps the PG in one rack per room, as expected! However, it looks like the PG will not move to another rack or room if it is undersized or if the entire room or rack is down.

Questions:

* Am I missing something to allow LRC PGs to move across racks/rooms for repair?
* Is it even possible to build such a "multi-stage" crushmap?

Thanks for your help,
Ansgar

1) https://docs.ceph.com/en/quincy/rados/operations/erasure-code-jerasure/
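
For reference, the generated rule and the resulting mappings can be inspected as shown below. The commented rule body is only a sketch of what I would expect the crush-steps above to translate into; the rule name, rule id, and the OSD ids used in the test are placeholders and will differ on a real cluster:

```bash
# Show the profile and the CRUSH rules currently defined
ceph osd erasure-code-profile get lrc_hdd
ceph osd crush rule dump

# Decompile the crushmap to see the generated rule in text form
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# Expected shape of the generated rule (sketch only; id/name/tunables differ):
#
# rule lrc_hdd_pool {
#     id 2
#     type erasure
#     step take default class hdd
#     step choose indep 4 type room
#     step choose indep 1 type rack
#     step chooseleaf indep 7 type host
#     step emit
# }

# Replay the rule offline with all 28 chunk positions and show the mappings.
# Re-running with --weight <osd-id> 0 simulates failed OSDs, which shows
# whether CRUSH picks replacements elsewhere or leaves the slots empty
# (2147483647 / NONE). The rule id (2) and OSD ids (12, 13) are examples.
crushtool -i crushmap.bin --test --rule 2 --num-rep 28 --show-mappings
crushtool -i crushmap.bin --test --rule 2 --num-rep 28 --show-mappings \
    --weight 12 0 --weight 13 0
```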