Hi,
>how many servers are your OSDs split over? Keep in mind that Ceph's default picks one OSD from each host, so you would need a minimum of 4 OSD hosts in total to be able to use 4+2 pools, and with only 4 hosts you have no spare failure domain. But 4 hosts is the minimum sane starting point for a regular small cluster with 3+2 pools (you can lose a node and Ceph self-heals as long as there is enough free space).
We’ll have 8 servers to split over (4 in each room). Thanks.
Best Rgds,
/st wong
From: Ronny Aasen [mailto:ronny+ceph-users@aasen.cx]
Sent: Friday, March 30, 2018 3:18 AM
To: ST Wong (ITSC); ceph-users@xxxxxxxxxxxxxx
Subject: Re: split brain case
On 29.03.2018 11:13, ST Wong (ITSC) wrote:
Hi,
Thanks.
> of course the 4 OSDs left working now want to self-heal by recreating all objects stored on the 4 split-off OSDs, which is a huge recovery job. And you may risk that the OSDs go into a too_full error, unless you have enough free space in your OSDs to recreate all the data from the defective part of the cluster. Or they will be stuck in recovery mode until you get the second room running; this depends on your crush map.
Does that mean we have to give the 4 OSD machines sufficient space to hold all the data, and thus the usable space will be halved?
Yes, if you want to be able to operate one room as if it were the whole cluster (HA), then you need this.
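To put rough numbers on it (this is only an illustration, assuming 8 equal OSD hosts of capacity C each and a replicated pool with size=4; adjust for your real layout):

    # Normal operation: raw = 8C, usable ~= 8C / 4 replicas = 2C.
    # For one room (4 hosts, raw 4C) to self-heal and hold all 4 replicas alone:
    #   usable <= 4C / 4 = 1C, i.e. half of the full cluster's usable space,
    #   and you still need headroom under the default full ratio (0.95).
    # Check current raw vs. usable capacity and per-pool usage with:
    ceph df detail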
Also, if you want 4+2 instead of 3+2 pool size to avoid the blocking during recovery, that would take a whole lot of extra space.
You can optionally let the cluster run degraded with 4+2 while one room is down, or temporarily set pools to 2+2 while the other room is down, to reduce the space requirements.
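For reference, pool replication settings can be changed on a live pool; a minimal sketch of that temporary 2+2 fallback, with "mypool" as a placeholder pool name:

    # While one room is down: drop to 2 copies, still requiring 2 for I/O.
    ceph osd pool set mypool size 2
    ceph osd pool set mypool min_size 2

    # When the second room is back and recovery has finished, restore e.g. 4+2:
    ceph osd pool set mypool size 4
    ceph osd pool set mypool min_size 2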
> the point is that splitting the cluster hurts. And if HA is the most important thing, then you may want to check out rbd mirror.
Will consider it when there is budget to set up another Ceph cluster for rbd mirror.
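In case it is useful later, here is a minimal sketch of what pool-level rbd mirroring looks like (pool and image names are placeholders; you would also need the second cluster and running rbd-mirror daemons, which this does not cover):

    # On both clusters: enable pool-mode mirroring (journaling based).
    rbd mirror pool enable mypool pool

    # Images need the journaling feature to be mirrored
    # (requires exclusive-lock, which is on by default for new images).
    rbd feature enable mypool/myimage journaling

    # Register the peer cluster (run on each side, with the other side's cluster name).
    rbd mirror pool peer add mypool client.admin@remote-cluster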
I do not know your needs or applications, but while you only have 2 rooms you may just think of it as a single cluster that happens to occupy 2 rooms. But with that few OSDs you should perhaps just put the cluster in a single room.
The pain of splitting a cluster down the middle is quite significant, and I would perhaps spend the resources on improving the redundancy of the network between the buildings instead: have multiple paths between the buildings to prevent a service disruption in the building that does not house the cluster.
Having 5 mons is quite a lot; I think most clusters run 3 mons even up to several hundred OSD hosts.
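Whatever mon count you end up with, the quorum arithmetic is what decides the split brain behaviour: with 5 mons a majority of 3 is required, so if they are split 3/2 over the two rooms, the room holding only 2 mons can never form a working cluster by itself. You can check who is in quorum at any time:

    # Quick overview of monitors and current quorum.
    ceph mon stat

    # More detail (leader, monmap epoch, quorum members).
    ceph quorum_status --format json-pretty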
How many servers are your OSDs split over? Keep in mind that Ceph's default picks one OSD from each host, so you would need a minimum of 4 OSD hosts in total to be able to use 4+2 pools, and with only 4 hosts you have no spare failure domain. But 4 hosts is the minimum sane starting point for a regular small cluster with 3+2 pools (you can lose a node and Ceph self-heals as long as there is enough free space).
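A hedged sketch of what that looks like in practice (rule and pool names below are placeholders): the default replicated CRUSH rule uses host as the failure domain, so a size=4 pool needs at least 4 OSD hosts, and you can make this explicit and check the layout yourself:

    # Show the host/OSD layout CRUSH picks replicas from.
    ceph osd tree

    # A replicated rule with host as the failure domain (names are placeholders).
    ceph osd crush rule create-replicated replicated_host default host

    # Apply it and set the "4+2" discussed above.
    ceph osd pool set mypool crush_rule replicated_host
    ceph osd pool set mypool size 4
    ceph osd pool set mypool min_size 2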
kind regards
Ronny Aasen
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com