On 2018-03-29 14:26, David Rabel wrote:
> Yes, but there are 2 OSDs available per PG per side of the partition, so
> effectively 2 separate active clusters. If different writes go to the two
> sides, both will be accepted, and it will not be possible to heal the
> cluster later, when the network issue is resolved, because of the
> inconsistency.
>
> On 29.03.2018 13:50, Peter Linder wrote:
>> On 2018-03-29 12:29, David Rabel wrote:
>>> On 29.03.2018 12:25, Janne Johansson wrote:
>>>> 2018-03-29 11:50 GMT+02:00 David Rabel <rabel@xxxxxxxxxxxxx>:
>>>>> You are right. But with my above example: if I have min_size 2 and
>>>>> size 4, and because of a network issue the 4 OSDs are split into 2
>>>>> and 2, is it possible that I have write operations on both sides and
>>>>> therefore have inconsistent data?
>>>>
>>>> You always write to the primary, which in turn sends copies to the 3
>>>> others, so in the 2+2 split case only one side can talk to the primary
>>>> OSD for that PG, so writes will just happen on one side at most.
>>
>> I'm not sure that this is true. Won't the side that doesn't have the
>> primary simply elect a new one when min_size=2 and there are 2 of
>> [failure domain] available? This is assuming that there are enough mons
>> as well. Even if this is the case, only half of the PGs would be
>> available and operations will stop.
>
> Why is this? If min_size is 2 and 2 OSDs are available, operations should
> not stop. Or am I wrong here?

Even if it were a 50/50 chance which side a PG would end up active on
(going by the original primary), it would mean trouble, as many writes
could not complete, but I don't think this is the case. You will have to
take mon quorum into account as well, of course, but that is outside the
scope of my post. The best thing, I believe, is to have an odd number of
everything. I don't know if you can have 4 OSD hosts and a 5th node just
for quorum, but I suppose it would be worth it if the extra quorum node
could not fail at the same time as 2 of the hosts.

> David
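
By the way, it is easy to see what your cluster would actually do here.
Something along these lines (the pool name "rbd" and the PG id 0.1a are
just placeholders, use whatever exists on your cluster):

  # replica count and the write floor for the pool
  ceph osd pool get rbd size
  ceph osd pool get rbd min_size

  # up/acting set for one PG, primary listed first
  ceph pg map 0.1a

  # monitor quorum membership and state
  ceph quorum_status
  ceph mon stat

If the acting set for a PG falls below min_size, I/O to that PG blocks
until enough replicas are available again.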
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com