Re: Stretch cluster experiences in production?

Hello Matthew,

Building stretch clusters is not a big deal. They work well and are stable
as long as you have your network under control. The network is the most
error-prone part of a stretch cluster, but that can easily be solved by
choosing a good vendor and good network gear.

For 3 data centers, make sure to have a dark fiber interconnect and avoid
things like managed Ethernet. Build a ring out of them using overlay
network technologies like EVPN BGP+ECMP+VXLAN, and keep all network paths
identical and active. This provides a stable, highly available network and
also avoids differing packet latencies through your network.

Once the storage backbone is capable of in-service upgrades and
downtime-free operation, just configure the CRUSH rule to span the 3 data
centers, with the correct host/OSD selection. Don't forget to place your
MONs and other services in all of these data centers as well.
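
A minimal sketch of what such a rule could look like in a decompiled CRUSH
map, assuming a root bucket named "default" that contains three buckets of
type "datacenter". The rule name and id are placeholders, not taken from a
real cluster:

```
# Hypothetical rule: one replica per data center (pool size 3).
rule replicated_3dc {
    id 1                  # placeholder, must be unique in your map
    type replicated
    min_size 1            # these two lines appear in Pacific-era maps;
    max_size 10           # newer releases have dropped them
    step take default     # assumed name of the root bucket
    step chooseleaf firstn 0 type datacenter   # one OSD in each distinct DC
    step emit
}
```

With a pool of size 3 this should place one copy in each data center. A
rule of this shape can also be created without editing the map by hand,
e.g. "ceph osd crush rule create-replicated replicated_3dc default
datacenter", and then assigned with "ceph osd pool set <pool> crush_rule
replicated_3dc".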

As additional tuning, your CRUSH rule can reflect a primary data center. So
if all your workload is in DC-A, you can configure it so that the primary
OSD of every PG is in this DC. This way your reads are always served
locally, which reduces network congestion. In addition, your writes will be
a little bit faster as well.
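
One way to express this is to select from each data center explicitly and
list the primary one first. The following is only a sketch: the bucket
names dc-a, dc-b and dc-c, the rule name and the id are hypothetical and
need to match your own CRUSH hierarchy:

```
# Hypothetical rule: first replica (normally the primary) always in dc-a.
rule primary_in_dc_a {
    id 2                  # placeholder, must be unique in your map
    type replicated
    min_size 1            # these two lines appear in Pacific-era maps;
    max_size 10           # newer releases have dropped them
    step take dc-a        # primary data center goes first
    step chooseleaf firstn 1 type host
    step emit
    step take dc-b
    step chooseleaf firstn 1 type host
    step emit
    step take dc-c
    step chooseleaf firstn 1 type host
    step emit
}
```

The first OSD a rule emits normally becomes the primary of the PG, unless
primary affinity settings or a failure in dc-a change that. Taking each
data center by name also guarantees exactly one copy per site, at the cost
of a rule that has to be edited if a site is added or renamed.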

We have quite a bit of experience with this and can help if you need more
details or vendor suggestions.

--
Martin Verges
Managing director

Mobile: +49 174 9335695  | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Fri, 15 Oct 2021 at 17:22, Matthew Vernon <mvernon@xxxxxxxxxxxxx> wrote:

> Hi,
>
> Stretch clusters[0] are new in Pacific; does anyone have experience of
> using one in production?
>
> I ask because I'm thinking about a new RGW cluster (split across two main
> DCs), which I would naturally be doing using RGW multi-site between two
> clusters.
>
> But it strikes me that a stretch cluster might be simpler (multi-site
> RGW isn't entirely straightforward, e.g. around resharding), and 2 copies
> per site is quite a bit less storage than 3 per site. But I'm not sure
> if this new feature is considered production-deployment-ready.
>
> Also, if I'm using RGWs, will they do the right thing location-wise?
> i.e. DC A RGWs will talk to DC A OSDs wherever possible?
>
> Thanks,
>
> Matthew
>
> [0] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>