Hi Paul, thanks for the detailed answer!

On Tuesday, 14.08.2018, 12:23 +0200, Paul Emmerich wrote:
> IIRC this will create a rule that tries to select n independent
> datacenters. Check the actual generated rule to validate this.

This is what it did, and looking back it makes sense. ;) My understanding of the CRUSH failure domain implementation changed daily while I was changing the rules, and at the time of writing to the ceph-users list my understanding was wrong.

> I think the only way to express "3 copies across two data centers" is
> by explicitly using the two data centers in the rule, as in:
>
> (pseudo code)
> take dc1
> chooseleaf 1 type host
> emit
> take dc2
> chooseleaf 2 type host
> emit
>
> which will always place 1 copy in dc1 and 2 in dc2. A rule like
>
> take default
> choose 2 type datacenter
> chooseleaf 2 type host
> emit
>
> will select a total of 4 hosts in two different data centers (2 hosts
> per dc).

This is how I solved it in the end; my question was aimed at doing this without manually editing the CRUSH map. It's okay that it does not work that way, but I'd rather ask than give up on keeping it simple. I used your second approach:

    step take default
    step choose firstn 0 type datacenter
    step chooseleaf firstn 2 type host
    step emit

This way the rule will keep working if another datacenter is added (wishful thinking) and replication is increased from four to six, without changing the CRUSH rule.

> But the real problem here is that 2 data centers in one Ceph cluster
> is just a poor fit for Ceph in most scenarios. 3 would be fine. Two
> independent clusters and async rbd-mirror or rgw synchronization
> would also be fine.
>
> But one cluster in two data centers and replicating via CRUSH just
> isn't how it works.
> Maybe you are looking for something like "3 independent racks" and
> you happen to have two racks in each dc? Really depends on your setup
> and requirements.
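For anyone finding this in the archives: in a decompiled CRUSH map, the complete rule would look roughly like the sketch below. The rule name, id, and min/max_size values are placeholders; check what your own decompiled map uses.

    rule replicated_datacenter {
        id 1
        type replicated
        min_size 2
        max_size 10
        step take default
        step choose firstn 0 type datacenter
        step chooseleaf firstn 2 type host
        step emit
    }

With "choose firstn 0 type datacenter", CRUSH descends into as many datacenter buckets as there are replicas (capped at the number of datacenters), then picks up to 2 distinct hosts in each, which is why size 4 lands as 2+2 here.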
Let me explain the setup a bit: I have two datacenters, each with one rack containing three nodes. I run with four replicated copies, so there are two copies in each datacenter and one datacenter can burn down without data loss.

I am aware of the monitor quorum problem. Yes, the storage is down if the "wrong" datacenter burns down. Still, we simply don't have a third one, and staying online during a disaster is not that important to us; we can recover manually. What matters most is that the data survives.

Thanks again for your input, highly appreciated!

Torsten

--
Torsten Casselt, IT-Sicherheit, Leibniz Universität IT Services
Tel: +49-(0)511-762-799095    Schlosswender Str. 5
Fax: +49-(0)511-762-3003      D-30159 Hannover
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com