Hello Matthew,

building stretch clusters is not a big deal. They work well and are
stable as long as you have your network under control. The network is
the most error-prone part of a stretch cluster, but that can easily be
solved by choosing a good vendor and good network gear.

For 3 data centers, make sure to have a dark fiber interconnect and
avoid things like managed Ethernet. Build a ring out of them using
overlay network technologies like EVPN BGP+ECMP+VXLAN and have all
network paths identical and active. This gives you a stable, highly
available network and also avoids differing packet transit times
through your network.

Once the storage backbone is capable of in-service upgrades and
downtime-free operation, define the 3 data centers in your CRUSH map
and use a CRUSH rule with the correct host/OSD selection. Don't forget
to place your MONs and services in all of these data centers as well.

As additional tuning, your CRUSH rule can reflect a primary data
center. If all your workload is in DC-A, you can configure the rule so
that the primary OSD of every PG is in this DC. This way your read
access is always local, which reduces network congestion. In addition,
your writes will be a little bit faster as well.

We have quite some experience with that and can be of help if you need
more details and vendor suggestions.

--
Martin Verges
Managing director

Mobile: +49 174 9335695 | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Fri, 15 Oct 2021 at 17:22, Matthew Vernon <mvernon@xxxxxxxxxxxxx> wrote:

> Hi,
>
> Stretch clusters[0] are new in Pacific; does anyone have experience of
> using one in production?
>
> I ask because I'm thinking about new RGW cluster (split across two main
> DCs), which I would naturally be doing using RGW multi-site between two
> clusters.
>
> But it strikes me that a stretch cluster might be simpler (multi-site
> RGW isn't entirely straightforward e.g. round resharding), and 2 copies
> per site is quite a bit less storage than 3 per site. But I'm not sure
> if this new feature is considered production-deployment-ready
>
> Also, if I'm using RGWs, will they do the right thing location-wise?
> i.e. DC A RGWs will talk to DC A OSDs wherever possible?
>
> Thanks,
>
> Matthew
>
> [0] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
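
As a rough illustration of the "primary data center" tuning described in the reply above, here is a minimal Python sketch that renders a CRUSH rule in text crush map syntax. It relies on the fact that CRUSH returns OSDs in the order the rule emits them and the first one normally acts as primary, so taking DC-A first keeps the primary OSD there. Everything named here is an assumption for illustration: the rule name, the rule id, and the bucket names DC-B and DC-C are hypothetical (only DC-A appears in the thread), and the helper function is not part of Ceph. The min_size/max_size lines follow the Pacific-era text format; compare against the output of `crushtool -d` on your own cluster before using anything like this.

```python
def primary_dc_rule(name, rule_id, primary, others, legacy_sizes=True):
    """Render a replicated CRUSH rule with one replica per data center.

    The primary data center is taken first, so its OSD is listed first
    in the result and normally becomes the primary OSD of the PG.
    All bucket names passed in are assumed to exist in the CRUSH map.
    """
    lines = [f"rule {name} {{", f"    id {rule_id}", "    type replicated"]
    if legacy_sizes:
        # Assumption: text crush maps up to Pacific carry these two lines.
        lines += ["    min_size 1", "    max_size 10"]
    for dc in [primary, *others]:
        lines += [
            f"    step take {dc}",
            "    step chooseleaf firstn 1 type host",  # one OSD on one host
            "    step emit",
        ]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical names: three data center buckets, DC-A holds the workload.
rule_text = primary_dc_rule("stretch_primary_dca", 10, "DC-A", ["DC-B", "DC-C"])
print(rule_text)
```

A pool using such a rule would be set to size 3 so that each data center holds exactly one copy; primary affinity settings on individual OSDs can still override which OSD ends up as primary.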