With k+m of 3+2, each RADOS object is broken into 5 shards (3 data, 2 coding). By default the pool will have a min_size of k+1 (4 in this case), which means you can lose 1 shard and still be >= min_size.

If one host goes down and you use a host-based failure domain (the default), every PG with a shard on that host loses 1 shard. Those PGs are now exactly at min_size and so still readable/writeable. If you then lose another host, the subset of PGs with shards on both downed hosts will drop to 3 healthy shards, below min_size, and will become inactive and therefore not readable/writeable.

As you can see, the higher your m, the more disks/hosts you can lose before dropping below min_size. (A back-of-the-envelope sketch of this arithmetic follows at the end of this message.)

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>

On Mon, Nov 27, 2023 at 1:36 PM <tranphong079@xxxxxxxxx> wrote:

> Hi Groups,
>
> Recently I set up a Ceph cluster with 10 nodes and 144 OSDs, and I use
> S3 on it with an erasure-coded pool, EC 3+2.
>
> I have a question: how many OSD nodes can fail with erasure code 3+2
> while the cluster keeps working normally (read, write)? And could I
> choose a better erasure code, e.g. EC 7+3, 8+2, etc.?
>
> As I understand it, the erasure code algorithm only ensures no data
> loss; it does not guarantee that the cluster operates normally and does
> not block IO when OSD nodes go down. Is that right?
>
> Thanks to the community.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
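
[Editor's sketch, not part of the original thread] A minimal Python sketch of the shard arithmetic described above. It assumes a host-based failure domain with at most one shard of a given PG per host, the default min_size of k+1, and that every failed host held a shard of the PG in question; the function name pg_state is hypothetical.

    # Sketch: PG availability for an EC k+m pool as hosts fail.
    # Assumes one shard per PG per failed host and min_size = k+1
    # (Ceph's default for erasure-coded pools).

    def pg_state(k: int, m: int, failed_hosts: int) -> str:
        """State of a PG that had one shard on each failed host."""
        total_shards = k + m      # 5 shards for EC 3+2
        min_size = k + 1          # default min_size (4 for EC 3+2)
        healthy = total_shards - failed_hosts
        if healthy < k:
            return "fewer than k shards: data unrecoverable"
        if healthy < min_size:
            return "below min_size: inactive, IO blocked"
        return "at or above min_size: readable/writeable"

    for failed in range(4):
        print(f"EC 3+2, {failed} failed host(s): {pg_state(3, 2, failed)}")

Running this prints that a PG stays active with one failed host, goes inactive (IO blocked) with two, and falls below k shards with three, which matches the explanation above.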