On Wed, May 4, 2022 at 1:25 AM Eneko Lacunza <elacunza@xxxxxxxxx> wrote:
> Hi Gregory,
>
> On 3/5/22 at 22:30, Gregory Farnum wrote:
> > On Mon, Apr 25, 2022 at 12:57 AM Eneko Lacunza <elacunza@xxxxxxxxx> wrote:
> > > We're looking to deploy a stretch cluster for a 2-CPD (two data
> > > center) deployment. I have read the following docs:
> > > https://docs.ceph.com/en/latest/rados/operations/stretch-mode/#stretch-clusters
> > >
> > > I have some questions:
> > >
> > > - Can we have multiple pools in a stretch cluster?
> >
> > Yes.
> >
> > > - Can we have multiple different CRUSH rules in a stretch cluster?
> > > I'm asking this because the command for stretch mode activation asks
> > > for a rule...
> >
> > Right, so what happens there is that any pool with a default rule gets
> > switched to the specified CRUSH rule. That doesn't stop you from
> > changing the rule after engaging stretch mode, or giving a pool a
> > non-default rule ahead of time. You just have to be careful to make
> > sure it satisfies the stretch mode rules about placing across data
> > centers.
>
> So the only purpose of the "stretch_rule" param to "enable_stretch_mode"
> is to replace the default replicated rule?
>
> If so, when stretch mode is activated, the described special behaviour
> applies to all CRUSH rules/pools:
>
> - The OSDs will only take PGs active when they peer across data centers
>   (or whatever other CRUSH bucket type you specified), assuming both are
>   alive.
>
> - Pools will increase in size from the default 3 to 4, expecting 2 copies
>   in each site (for size=2 pools, will it increase to 4 too?).
>
> Is this accurate?

Yep! AFAIK most clusters just use the default pool for replication,
barring mixed media types. *shrug*

> > > We want to have different-purpose pools on this Ceph cluster:
> > >
> > > - Important VM disks, with 2 copies in each DC (SSD class)
> > > - Ephemeral VM disks, with just 2 copies overall (SSD class)
> > > - Backup data in just one DC (HDD class).
> > >
> > > The objective of the 2-DC deployment is disaster recovery; HA isn't
> > > required, but I'll take it if the deployment is reasonable :-)
> >
> > I'm leery of this for the reasons described in the docs -- if you don't
> > have 2 replicas per site, you lose data availability every time an OSD
> > goes down for any reason (or else you have a window while recovery
> > happens where the data is not physically available in both sites,
> > which rather negates the purpose).
>
> Is this because of the following: "the OSDs will only take PGs active
> when they peer across data centers (or whatever other CRUSH bucket type
> you specified), assuming both are alive"?
>
> This conclusion wasn't obvious to me, but after your reply and a third
> read, it now seems expected :-)

Right!

> Thanks for your comments. I'll be doing some tests and will write back
> if I find something unexpected (to me?) :-)
>
> Cheers
>
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
>
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
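
For reference, a minimal sketch of the mechanics discussed in this thread,
closely following the example in the stretch-mode doc linked at the top.
The site names (site1/site2/site3), monitor names (a-e) and the rule id are
placeholders rather than anything from this cluster, so adapt them to your
own CRUSH tree:

  # Decompile the CRUSH map and add a rule that places 2 copies in each
  # data center:
  ceph osd getcrushmap > crush.map.bin
  crushtool -d crush.map.bin -o crush.map.txt

  # In crush.map.txt, add a rule along these lines:
  rule stretch_rule {
          id 1
          type replicated
          step take site1
          step chooseleaf firstn 2 type host
          step emit
          step take site2
          step chooseleaf firstn 2 type host
          step emit
  }

  # Compile and inject the edited map:
  crushtool -c crush.map.txt -o crush2.map.bin
  ceph osd setcrushmap -i crush2.map.bin

  # Tell the monitors which data center they live in, switch to the
  # connectivity election strategy, then enable stretch mode, naming the
  # tiebreaker mon, the rule above and the dividing bucket type:
  ceph mon set_location a datacenter=site1
  ceph mon set_location b datacenter=site1
  ceph mon set_location c datacenter=site2
  ceph mon set_location d datacenter=site2
  ceph mon set_location e datacenter=site3
  ceph mon set election_strategy connectivity
  ceph mon enable_stretch_mode e stretch_rule datacenter

As discussed above, pools still on the default rule get switched to the
named rule when stretch mode is enabled; any other pool can be pointed at
it (or at another DC-spanning rule) afterwards with
"ceph osd pool set <pool> crush_rule <rule>".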
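
For the different-purpose pools, a sketch of how device-class rules might
look, with the same caveats Greg raises above. The bucket names dc1/dc2,
the pool name and the rule name/id are made up for illustration:

  # A class-aware variant of the stretch rule, keeping 2 SSD copies per
  # data center (for the "important VM disks" pool):
  rule stretch_ssd_rule {
          id 2
          type replicated
          step take dc1 class ssd
          step chooseleaf firstn 2 type host
          step emit
          step take dc2 class ssd
          step chooseleaf firstn 2 type host
          step emit
  }

  # Attach it to the pool and keep 2 copies in each site:
  ceph osd pool set important-vms crush_rule stretch_ssd_rule
  ceph osd pool set important-vms size 4

  # An HDD-only rule rooted in a single data center could be created with:
  ceph osd crush rule create-replicated backup_dc1_hdd dc1 host hdd
  # ...but per the peering behaviour quoted above and the warning about
  # fewer than 2 replicas per site, single-DC and 2-copies-overall pools
  # don't really fit stretch mode.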