On 11/10/2017 7:17 AM, Sébastien VIGNERON wrote:
> Hi everyone,

If that is your goal, then why 3-way replication?
min_size and max_size here don't do what you expect them to. You need min_size = 1 for a 2-way replicated cluster (beware of inconsistencies if the link between the DCs goes down) and min_size = 2 for a 3-way replicated cluster, but the setting lives on the pool, not in the CRUSH rule.
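For reference, size and min_size are set per pool with the standard `ceph osd pool set` commands; a minimal sketch (the pool name "rbd" here is only an assumption for illustration):

```shell
# Replica count for the pool (3-way replication).
ceph osd pool set rbd size 3
# Minimum number of replicas that must be up for I/O to proceed;
# with min_size = 2, losing one of three replicas keeps the PG active.
ceph osd pool set rbd min_size 2
```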
This rule seems to, for each PG, take an OSD on a host in DC-1, then an OSD on a host in DC-2, and then just a random OSD on a random host anywhere. About 50% of those extra OSDs will land in DC1 and the rest in DC2. When the link is cut, roughly 50% of the PGs will not be able to fulfil the min_size = 2 requirement (depending on whether the observer sits in DC1 or DC2, of course) and operations will stop on those PGs. In practice this should stop all operations, and I'm not even considering monitor quorum yet.

I don't really know why there is a difference here. We opted for a 3-way cluster across 3 separate datacenters, though. Perhaps you can somehow simulate 2 separate datacenters inside one of yours; at the very least make sure they are on different power circuits, etc. Also, consider redundancy for your network so that it does not go down. Spanning Tree is a little slow, but TRILL or SPB should work in your case.
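The ~50% figure above is easy to check with a toy simulation: pin one replica to each DC, place the third at random, cut the link, and count PGs that fall below min_size = 2 as seen from DC1. This is a sketch of the described placement behaviour, not of CRUSH itself:

```python
import random

random.seed(7)  # deterministic sketch

DCS = ("DC1", "DC2")
NUM_PGS = 10_000
MIN_SIZE = 2

def place_pg():
    # Mimic the rule described above: one replica in each DC,
    # plus a third replica on a random host anywhere.
    return ["DC1", "DC2", random.choice(DCS)]

# Cut the inter-DC link and observe from DC1: only replicas
# inside DC1 are still reachable.
blocked = 0
for _ in range(NUM_PGS):
    replicas = place_pg()
    surviving = sum(1 for dc in replicas if dc == "DC1")
    if surviving < MIN_SIZE:
        blocked += 1

print(f"blocked PGs: {blocked / NUM_PGS:.1%}")  # roughly 50%
```

PGs whose random third replica happened to land in DC2 keep only one reachable copy, so they block; that is about half of them.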
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com