Hi Janne,

I use the default profile (2+1) and set failure-domain=host. Is that the best practice?

On Fri, 26 Jun 2020 at 16:59, Janne Johansson <icepic.dz@xxxxxxxxx> wrote:

> On Fri, 26 Jun 2020 at 10:32, Zhenshi Zhou <deaderzzs@xxxxxxxxx> wrote:
>
>> Hi all,
>>
>> I'm going to deploy a cluster with an erasure-coded pool for cold storage.
>> I have 3 servers to set the cluster up with, 12 OSDs on each server.
>> If I set the EC profile to k=4 and m=2, does that mean the data is safe
>> while 1/3 of the cluster's OSDs are down, or only while 2 of the OSDs
>> are down?
>
> By default, CRUSH will want to place each part (of 6 in your case, for EC
> 4+2) on a host of its own, to maximize data safety. Since you can't do
> that with 3 hosts, you must make sure no more than 2 pieces ever end up on
> a single host, so you can't just move from failure-domain=host to
> failure-domain=osd: that could place all 6 pieces on different OSDs of the
> same host, which would be bad.
>
> You need to make the CRUSH rule pick two different OSDs per host, but not
> more. One way could be to build a tree where each host has half of its
> OSDs in one branch and the other half in another (let's call the branches
> "subhosts" in this example). Then you get 3*2 subhosts, and if you make
> CRUSH pick placement from the subhosts, it will always put two pieces per
> OSD host, never two on the same OSD, and it will allow one host to be down
> for a while.
>
> I would like to add that data is not very secure when you have no
> redundancy left at all. Machines will crash, and they will require
> maintenance, patches, BIOS updates and things like that. Having NO
> redundancy during planned or unplanned downtime places the data at huge
> risk: _any_ surprise in this situation would immediately lead to data
> loss.
>
> Also, if one box dies, the cluster can't recover until you have a new
> host back in, so you are already running at the edge of data safety in
> your normal case. Even if this will "work", Ceph, being a cluster, really
> should have N+1 hosts or more if your data split (replication factor or
> EC k+m) is equal to N.
>
> --
> May the most significant bit of your life be positive.
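
To spell out the failure math: with k=4, m=2 each placement group survives the loss of any 2 of its 6 shards, so with exactly two shards per host the pool survives one whole host going down (or any 2 OSDs), but not a host plus a further OSD. As an alternative to the subhost tree Janne describes, the same two-shards-per-host placement can also be forced directly with a two-step choose in the CRUSH rule. This is only an untested sketch; the rule name, the id, and the root bucket name "default" are placeholders for whatever your actual map uses:

    rule ec42_two_per_host {
        id 2                              # any unused rule id
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default                 # root of the CRUSH tree to draw from
        step choose indep 3 type host     # pick 3 distinct hosts
        step chooseleaf indep 2 type osd  # then exactly 2 OSDs under each host
        step emit
    }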
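
If you go this route, the usual workflow is to decompile the CRUSH map, add the rule, and test the mappings offline with crushtool before injecting anything; the file names below are arbitrary:

    # export and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # (edit crushmap.txt to add the rule above, then recompile)
    crushtool -c crushmap.txt -o crushmap.new

    # dry run: show which OSDs the rule would pick for 6 shards
    crushtool -i crushmap.new --test --rule 2 --num-rep 6 --show-mappings

    # inject only once the mappings look right
    ceph osd setcrushmap -i crushmap.new

Each mapping line from --test should list 6 OSDs, with no more than two of them belonging to the same host.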