Re: EC Profiles & DR

On 06/12/2023 at 00:11, Rich Freeman wrote:
On Tue, Dec 5, 2023 at 6:35 AM Patrick Begou
<Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx>  wrote:
Ok, so I've misunderstood the meaning of failure domain. If there is no
way to request using 2 OSDs per node with node as the failure domain,
then with 5 nodes k=3+m=1 is not secure enough and I will have to use
k=2+m=2, so something like a RAID1 setup. A little bit better than
replication from the point of view of global storage capacity.

I'm not sure what you mean by requesting 2 OSDs/node.  If the failure
domain is set to the host, then by default k/m refer to hosts, and the
PGs will be spread across all OSDs on all hosts, but with any
particular PG only being present on one OSD on each host.  You can get
fancy with device classes and crush rules and be more specific about
how the chunks are allocated, but that would be the typical behavior.
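
For illustration, creating a profile with the host failure domain and a pool on top of it would look something like the following; the profile name "ec22host", the pool name "ecpool" and the PG counts are only placeholders, and crush-device-class is optional:

    # EC profile: 2 data + 2 coding chunks, one chunk per host
    ceph osd erasure-code-profile set ec22host \
        k=2 m=2 \
        crush-failure-domain=host \
        crush-device-class=hdd
    # pool using that profile (PG numbers here are just an example)
    ceph osd pool create ecpool 32 32 erasure ec22host

With crush-failure-domain=host the generated rule places each of the 4 chunks on a different host, which is why at least k+m hosts are needed.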

Since k/m refer to hosts, then k+m must be less than or equal to the
number of hosts or you'll have a degraded pool because there won't be
enough hosts to allocate them all.  It won't ever stack them across
multiple OSDs on the same host with that configuration.

k=2,m=2 with min_size=3 would require at least 4 hosts (k+m), and would
allow you to operate degraded with a single host down; with two hosts
down the PGs would become inactive but would still be recoverable.
While strictly speaking only 4 hosts are required, you'd do better to
have more than that, since then the cluster can immediately recover
from a loss, assuming you have sufficient space.  As you say it is no
more space-efficient than RAID1 or size=2, and it suffers write
amplification for modifications, but it does allow recovery after the
loss of up to two hosts, and you can operate degraded with one host
down, which allows for reasonably high availability.
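
To put numbers on the space point: with k=2,m=2 the usable fraction is k/(k+m) = 2/4 = 50%, the same as size=2 replication or RAID1, versus roughly 33% with size=3. The "min_size=3" above refers to the pool's min_size, which for an EC pool defaults to k+1 on recent releases; using the placeholder pool name from above it can be checked or changed with:

    ceph osd pool get ecpool min_size
    ceph osd pool set ecpool min_size 3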

Hi Rich,

My understanding was that k and m were for EC chunks, not hosts. 🙁 Of course, if k and m are hosts, the best choice would be k=2 and m=2.
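
For what it's worth, both readings can be reconciled: k and m do count chunks in the profile, and it is crush-failure-domain that decides how those chunks are spread. With the default rule and failure domain = host, each chunk lands on a distinct host, so k+m effectively becomes a host count. The profile settings can be inspected with, e.g. using the placeholder profile name from earlier:

    ceph osd erasure-code-profile get ec22host

which lists k, m and crush-failure-domain among the other parameters.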

When Christian wrote:
"For example if you run an EC=4+2 profile on 3 hosts you can structure your crushmap so that you have 2 chunks per host. This means even if one host is down you are still guaranteed to have 4 chunks available."

This is what I had thought before (but using 5 nodes instead of the 3 in Christian's example). However, it does not match what you explain if k and m are nodes.
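
A rough, untested sketch of the kind of crushmap rule Christian seems to describe (2 chunks per host, here for an EC 4+2 pool over 3 hosts; the rule name and id are arbitrary):

    rule ec42_two_per_host {
        id 42
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        # pick 3 distinct hosts, then 2 OSDs on each host -> 6 chunks total
        step choose indep 3 type host
        step chooseleaf indep 2 type osd
        step emit
    }

The usual round trip would be ceph osd getcrushmap and crushtool -d to decompile, edit, then crushtool -c and ceph osd setcrushmap to inject the map, and finally ceph osd pool set <pool> crush_rule ec42_two_per_host. With 5 nodes and 2 chunks per node the same pattern generalises, with the caveat that losing one node then costs 2 chunks at once, so m needs to be at least 2.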

I'm still a little bit confused by the crushmap settings.

Patrick
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



