On Wed, Dec 6, 2023 at 9:25 AM Patrick Begou
<Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> My understood was that k and m were for EC chunks not hosts. 🙁 Of
> course if k and m are hosts the best choice would be k=2 and m=2.

A few others have already replied - as they said, if the failure domain is
set to host, then Ceph will place only one chunk per host for each PG.

You can get fancy and alter this behavior (something that hadn't originally
occurred to me), but you'll need to take care that your host redundancy ends
up where you want it. The main reason to do that, I would think, would be
either as an interim configuration or to get a different level of disk
redundancy than host redundancy. There might be some use cases I'm not
thinking of, but it is definitely atypical. In any case, if you start
putting multiple chunks for a PG on a single host, you'll have to make sure
you can still meet whatever goals you have for host failures and high
availability: you'll lose more than one chunk when a single host goes down.

If anything, the more common pattern I've seen is to distribute chunks even
more widely than the host level, such as across racks/switches/PDUs/datacenters/etc.

Personally, if I had 5 nodes and they were balanced, I'd probably just run
k=2, m=2, or even just size=3, and set the failure domain to host (a rough
command sketch is at the end of this message). I avoid manually editing
CRUSH maps, but my needs are fairly straightforward - I'm just using hdd/ssd
device classes, and every pool is on one or the other with a host failure
domain. I'm also using Rook, which abstracts it a bit further, but it isn't
doing anything too fancy with the actual pools beyond mapping them to k8s
entities and running the various daemons.

--
Rich
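
In case it helps, here is a rough, untested sketch of the kind of commands
I mean (the profile, rule, and pool names are made up, and the syntax
assumes a reasonably recent Ceph release):

    # EC pool: k=2 data + m=2 coding chunks, one chunk per host, hdd OSDs only
    ceph osd erasure-code-profile set ec22-host k=2 m=2 \
        crush-failure-domain=host crush-device-class=hdd
    ceph osd pool create ecpool erasure ec22-host

    # Or plain 3x replication pinned to hdd with a host failure domain
    ceph osd crush rule create-replicated rep-host-hdd default host hdd
    ceph osd pool create rpool replicated rep-host-hdd
    ceph osd pool set rpool size 3

With k=2/m=2 and a host failure domain, each PG lands on 4 of the 5 hosts
and the data survives the loss of any 2 chunks at roughly 50% usable
capacity; size=3 tolerates the same 2 failures at roughly 33% usable
capacity.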