Re: stretched cluster or not, with mon in 3 DC and osds on 2 DC

Jan-Philipp Litza <jpl@xxxxxxxxx> · Mon, 14 Jun 2021 08:44:09 +0200

Hi,

I only read that documentation page [1] on Friday, so I can't tell
you anything that isn't on it. But the particular problem of which
monitor gets elected should be solvable simply by using the
connectivity election mode [2], shouldn't it?
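
For reference, [2] describes switching the election strategy as a
single monitor command. Here is a minimal Python sketch that only builds
that command line. The helper name election_strategy_cmd is made up for
illustration, and actually running the command assumes a host with a
working ceph CLI and admin credentials:

```python
# Strategies listed in the change-mon-elections documentation [2].
VALID_STRATEGIES = ("classic", "disallow", "connectivity")


def election_strategy_cmd(strategy):
    """Build the argv for `ceph mon set election_strategy <strategy>`.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown election strategy: {strategy!r}")
    return ["ceph", "mon", "set", "election_strategy", strategy]


cmd = election_strategy_cmd("connectivity")
print(" ".join(cmd))
# To apply it on a real cluster (assumption: ceph CLI and keyring present):
#   import subprocess; subprocess.run(cmd, check=True)
```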

Apart from the latency to the mon, the stretch cluster is mainly about
the failover characteristics of the OSDs: when DC1 or DC2 fails without
a stretch cluster, the surviving DC will try to re-replicate all the
data to get back to size=4. With a stretch cluster, it will happily live
with size=2 until the other DC comes back online.
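
To put rough numbers on that difference, here is a toy calculation in
Python. It only mirrors the description above and assumes copies are
split evenly across the DCs. It does not model CRUSH, whose real
behaviour depends on the rule in use, and the function names and the
100 TB figure are made up for illustration:

```python
def target_copies_after_failure(size, total_dcs, stretch_mode):
    """Copies per object the one surviving DC is expected to hold (toy model)."""
    per_dc = size // total_dcs
    if stretch_mode:
        # Stretch cluster: keep running on the copies that are already local.
        return per_dc
    # No stretch cluster: recovery tries to get back to the full pool size.
    return size


def raw_tb_needed(user_tb, size, total_dcs, stretch_mode):
    """Raw capacity the surviving DC needs to hold its target copies."""
    return user_tb * target_copies_after_failure(size, total_dcs, stretch_mode)


# 100 TB of user data, size=4, two DCs, one DC lost:
print(raw_tb_needed(100, 4, 2, stretch_mode=False))  # 400
print(raw_tb_needed(100, 4, 2, stretch_mode=True))   # 200
```

In this model the surviving DC needs twice the raw capacity without a
stretch cluster, on top of the recovery traffic itself.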

So if it's safe to assume that a failed DC (god forbid) will come back
online fairly soon, and the cluster can live with size=2 in the
meantime, then a stretch cluster is probably the better choice.

Also, as the documentation states, there are edge cases where, even
with an appropriate CRUSH rule, size=4 min_size=2 doesn't necessarily
mean you have a live copy of every PG in each of the two DCs.
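
A small Python enumeration shows why min_size=2 alone can't guarantee
that. It is only a sketch: the OSD names and their placement are
hypothetical, and it ignores how peering actually picks the acting set.
It simply lists which sets of two live replicas would satisfy
min_size=2 while sitting entirely in one DC:

```python
from itertools import combinations

# Hypothetical placement of one PG: four replicas, two per DC.
replica_dcs = {"osd.0": "dc1", "osd.1": "dc1", "osd.2": "dc2", "osd.3": "dc2"}


def single_dc_sets(replica_dcs, min_size):
    """Return sets of exactly min_size replicas that all live in one DC."""
    return [
        subset
        for subset in combinations(sorted(replica_dcs), min_size)
        if len({replica_dcs[osd] for osd in subset}) == 1
    ]


print(single_dc_sets(replica_dcs, min_size=2))
# [('osd.0', 'osd.1'), ('osd.2', 'osd.3')]
```

So two of the six possible pairs of surviving replicas satisfy
min_size=2 without any copy in the other DC.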

Best regards,
Jan-Philipp

[1]: https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
[2]: https://docs.ceph.com/en/latest/rados/operations/change-mon-elections/
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx