Hey all,

We are trying to get an erasure coding cluster up and running, but we are having a problem getting the cluster to stay up if we lose an OSD host. Currently we have 6 OSD hosts with 6 OSDs apiece. I'm trying to build an EC profile and a CRUSH rule that will allow the cluster to keep running if we lose a host, but I seem to misunderstand how the configuration of an EC pool/cluster is supposed to be implemented. I would like to set this up to tolerate 2 host failures before data loss occurs.

Here is my CRUSH rule:

    {
        "rule_id": 2,
        "rule_name": "EC_ENA",
        "ruleset": 2,
        "type": 3,
        "min_size": 6,
        "max_size": 8,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "choose_indep",
                "num": 4,
                "type": "host"
            },
            {
                "op": "choose_indep",
                "num": 2,
                "type": "osd"
            },
            {
                "op": "emit"
            }
        ]
    }

Here is my EC profile:

    crush-device-class=
    crush-failure-domain=host
    crush-root=default
    jerasure-per-chunk-alignment=false
    k=6
    m=2
    plugin=jerasure
    technique=reed_sol_van
    w=8

Any direction or help would be greatly appreciated.

Thanks,

Tim Gipson
Systems Engineer
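
P.S. In case it helps to reproduce this, below is roughly how the profile and pool were set up. The profile name, pool name, and PG counts are placeholders rather than the exact values from our cluster, and the last command is just how the rule dump above was generated:

    # Create the EC profile (same parameters as listed above;
    # w=8 and jerasure-per-chunk-alignment=false are the defaults)
    ceph osd erasure-code-profile set ec_ena_profile \
        k=6 m=2 \
        plugin=jerasure technique=reed_sol_van \
        crush-failure-domain=host crush-root=default

    # Create the pool against that profile, then point it at the custom rule
    ceph osd pool create ecpool 1024 1024 erasure ec_ena_profile
    ceph osd pool set ecpool crush_rule EC_ENA

    # Dump the rule the pool is actually using (output shown above)
    ceph osd crush rule dump EC_ENA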