Re: Erasure coding scheme 2+4 = good idea?

Simon Kepp <simon@xxxxxxxxx> · Thu, 10 Oct 2024 02:41:58 +0200

Hi Andre,
Your setup and thoughts make good sense, but are somewhat unusual. Having
only two instances of your failure domain limits you quite a lot, and goes
against common best practice, which calls for at least three and preferably four.
It is understandable that you are limited to 2, when your fault domain is
as large as a datacenter, but it does limit your options somewhat. The
reason you can't find information on erasure coding where m > k is
probably the low storage efficiency this provides, since people
tend to choose EC to improve storage efficiency. Both setups that you
suggest yourself (4x replication and 2+4 EC) should meet your demands.
Given how close they are in storage efficiency (25% vs 33%), I would lean
towards 4x replication. It is a lot simpler to quickly understand the
degree of redundancy available, both during initial planning, and when
facing an actual potentially stressful error scenario with degraded
availability, compared to erasure coding, which requires a bit more mental
arithmetic to figure out how badly screwed you are in a given situation.
So, assuming that the ideal situation of spreading your OSDs across 3 or 4
data centers is not an option (this would allow you to safely go with the
default 3x replication), I would clearly lean towards 4x replication with 2
copies in each DC, but a 2+4 EC solution also seems like a viable choice
offering a slightly better storage efficiency, at the expense of increased
complexity.
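
To make the mental arithmetic concrete, here is a rough Python sketch of the numbers discussed above. It assumes the pieces (replicas or EC chunks) are spread evenly across the two DCs, as in your description; the function names are made up for illustration and are not part of any Ceph tooling. It only checks whether the data is still readable, nothing more.

```python
from fractions import Fraction


def efficiency(data_pieces: int, total_pieces: int) -> Fraction:
    """Usable fraction of raw capacity."""
    return Fraction(data_pieces, total_pieces)


def spare_after_site_loss(total_pieces: int, needed: int, sites: int = 2) -> int:
    """Pieces that may still fail after one whole site is lost.

    Assumes pieces are spread evenly across the sites (hypothetical helper).
    """
    if total_pieces % sites != 0:
        raise ValueError("pieces must divide evenly across sites")
    surviving = total_pieces - total_pieces // sites
    return surviving - needed


# 4x replication: 1 data copy out of 4; any single copy is enough to read.
rep_eff = efficiency(1, 4)
rep_spare = spare_after_site_loss(total_pieces=4, needed=1)

# 2+4 EC: k=2 data chunks out of k+m=6; any k=2 chunks are enough to read.
ec_eff = efficiency(2, 6)
ec_spare = spare_after_site_loss(total_pieces=6, needed=2)

print(f"4x replication: efficiency {float(rep_eff):.0%}, spare after DC loss: {rep_spare}")
print(f"2+4 EC:         efficiency {float(ec_eff):.0%}, spare after DC loss: {ec_spare}")
```

Under these assumptions both options leave exactly one spare piece after a full DC loss, so the difference comes down to 25% vs 33% efficiency against the extra complexity of EC.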

Best Regards,
Simon Kepp,
Founder, CTO
Kepp Technologies

On Thu, Oct 10, 2024 at 1:17 AM Andre Tann <atann@xxxxxxxxxxxx> wrote:

> Hi all,
>
> I'd like your thoughts and comments on this idea:
>
> Setup:
> - two fault domains = DCs
> - connected with 100 Gbit/s, < 1 ms latency
> - 80 NVMe SSDs on each side
>
> Goal:
> One fault domain can be lost, and there is still some redundancy left.
>
> Option 1:
> Replicated pool with size = 4. This gives me two copies on each side,
> thus meets the goal. But the efficiency is only 25%.
>
> Option 2:
> Erasure coded pool with 2 + 4 scheme. This gives me 3 chunks on each
> side. If I lose one side, I still have 3 chunks left, where I only need
> 2. Thus the goal is also met. Efficiency is 33%.
>
>
> Even though I did a lot of googling, I couldn't find anything about a
> similar setup. In all profiles, there is k >= m.
>
>
> What do you think about 2+4, is it a good idea or a bad one, or am I
> missing something and it doesn't work at all?
>
> In particular: is it possible to recover two data chunks out of 2 coding
> chunks? As I read the documentation, this should be no problem, just
> want to confirm.
>
> --
> Andre Tann
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>