On 29/03/18 09:25, ST Wong (ITSC) wrote:
> Hi all,
>
> We put 8 (4+4) OSD and 5 (2+3) MON servers in server rooms in 2
> buildings for redundancy. The buildings are connected through direct
> connection.
>
> While servers in each building have alternate uplinks. What will
> happen in case the link between the buildings is broken (application
> servers in each server room will continue to write to OSDs in the
> same room) ?
>
> Thanks a lot.

The 3 mons in your second building will be able to remain quorate (as 3
is a majority of 5) and keep running the cluster. The other 2 mons will
refuse to do anything since they can't find enough other monitors to
form quorum.

PGs that have enough replicas in the 3-mon building to be at or above
min_size will continue to serve I/O; however, PGs with fewer than
min_size copies available will block I/O until you either bring the
link back, or the missing OSDs are manually/automatically marked out
and enough time passes for those PGs to recover to at least min_size
replicas on the working side.

As far as anything in the 2-mon building is concerned, Ceph will be
entirely nonfunctional. Recovery would propagate any changes made on
the working side when the link comes back up.

Ceph is designed to avoid split-brain scenarios to protect data
consistency, but the consequence is that if your cluster does get
partitioned, a lot of it may stop working. You can design CRUSH rules
to help mitigate the impact in the working part (for instance, making
sure that every PG places enough copies of itself on the 3-mon side
that it will be able to continue serving I/O if the other building is
lost), but you will never have a situation where the cluster is split
into two, both sides continue operating, and then they join back up.

Rich
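
As a rough illustration of the quorum and min_size arithmetic described
above, here is a small Python sketch. It is only a model of the
reasoning, not how Ceph implements it: the function names are made up
for this example, and the pool with min_size=2 and up to 3 replicas is
an assumption, since the original question does not state the pool
settings.

```python
def has_quorum(reachable_mons: int, total_mons: int) -> bool:
    """Monitors form quorum only with a strict majority of all monitors."""
    return reachable_mons > total_mons // 2


def pg_serves_io(side_has_quorum: bool, replicas_on_side: int, min_size: int) -> bool:
    """A PG serves I/O only on a quorate side that holds at least min_size replicas."""
    return side_has_quorum and replicas_on_side >= min_size


TOTAL_MONS = 5   # from the thread: 2 + 3 monitors
MIN_SIZE = 2     # assumed pool setting, not stated in the thread
MAX_REPLICAS = 3 # assumed pool size, not stated in the thread

for building, mons in (("2-mon building", 2), ("3-mon building", 3)):
    quorate = has_quorum(mons, TOTAL_MONS)
    print(f"{building}: quorate={quorate}")
    for replicas in range(MAX_REPLICAS + 1):
        ok = pg_serves_io(quorate, replicas, MIN_SIZE)
        print(f"  PG with {replicas} replica(s) on this side: serves I/O={ok}")
```

Under these assumptions, nothing on the 2-mon side serves I/O no matter
how many replicas it holds, and on the 3-mon side only PGs with at
least min_size local replicas keep going, which is why placing enough
copies on the 3-mon side via CRUSH rules helps.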
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com