Hi Patrick,

Yes, k and m are chunks, but the default CRUSH map places one chunk per host, which is probably the best way to do it, though I'm no expert. I'm not sure why you would want a CRUSH map with 2 chunks per host and min_size 4, as in my opinion it's just asking for trouble at some point. Anyway, if you are interested in doing 2 chunks per host, take a look at this post; it will give you an idea of the crushmap setup (a rough, untested sketch is also at the bottom of this mail):
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/NB3M22GNAC7VNWW7YBVYTH6TBZOYLTWA/

Regards,
Curt

On Wed, Dec 6, 2023 at 6:26 PM Patrick Begou
<Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> On 06/12/2023 at 00:11, Rich Freeman wrote:
> > On Tue, Dec 5, 2023 at 6:35 AM Patrick Begou
> > <Patrick.Begou@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> Ok, so I've misunderstood the meaning of failure domain. If there is no
> >> way to request using 2 OSDs per node with node as the failure domain, then
> >> with 5 nodes k=3+m=1 is not secure enough and I will have to use k=2+m=2,
> >> so like a RAID1 setup. A little bit better than replication from the point
> >> of view of global storage capacity.
> >>
> > I'm not sure what you mean by requesting 2 OSDs per node. If the failure
> > domain is set to the host, then by default k/m refer to hosts, and the
> > PGs will be spread across all OSDs on all hosts, but with any
> > particular PG only being present on one OSD on each host. You can get
> > fancy with device classes and CRUSH rules and be more specific about
> > how they're allocated, but that is the typical behavior.
> >
> > Since k/m refer to hosts, k+m must be less than or equal to the
> > number of hosts, or you'll have a degraded pool because there won't be
> > enough hosts to allocate them all. It won't ever stack them across
> > multiple OSDs on the same host with that configuration.
> >
> > k=2, m=2 with min_size=3 would require at least 4 hosts (k+m), and would
> > allow you to operate degraded with a single host down; the PGs
> > would become inactive but would still be recoverable with two hosts
> > down. While strictly speaking only 4 hosts are required, you'd do
> > better to have more than that, since then the cluster can immediately
> > recover from a loss, assuming you have sufficient space. As you say,
> > it is no more space-efficient than RAID1 or size=2, and it suffers
> > write amplification for modifications, but it does allow recovery
> > after the loss of up to two hosts, and you can operate degraded with
> > one host down, which allows for somewhat high availability.
> >
> Hi Rich,
>
> My understanding was that k and m were EC chunks, not hosts. 🙁 Of
> course, if k and m are hosts, the best choice would be k=2 and m=2.
>
> When Christian wrote:
>
> "For example, if you run an EC=4+2 profile on 3 hosts you can structure
> your crushmap so that you have 2 chunks per host. This means even if one
> host is down you are still guaranteed to have 4 chunks available."
>
> this is what I had thought before (using 5 nodes instead of 3 as in
> Christian's example). But it does not match what you explain if k and m
> are nodes.
>
> I'm a little bit confused by the crushmap settings.
>
> Patrick
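
For completeness, the plain k=2, m=2 layout with host as the failure domain that Rich describes boils down to something like the following. The profile and pool names here are just examples, not anything from your cluster, so adjust to taste:

# Create an EC profile with 2 data + 2 coding chunks, one chunk per host
ceph osd erasure-code-profile set ec22-host k=2 m=2 crush-failure-domain=host

# Create a pool using that profile
ceph osd pool create ecpool erasure ec22-host

# Keep serving I/O with one host down; k+1 = 3 is also the default
# min_size for this profile, so this is mostly for clarity
ceph osd pool set ecpool min_size 3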
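And since you asked about 2 chunks per host: for Christian's EC=4+2-on-3-hosts example, the approach in that thread looks roughly like the sketch below. The rule name, id and host/OSD counts are placeholders and I haven't tested this, so decompile your own map and double-check before injecting anything:

# Dump and decompile the current CRUSH map
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt

# Add a rule along these lines to crush.txt (pick an unused id):
#
# rule ec42_two_per_host {
#     id 9
#     type erasure
#     step set_chooseleaf_tries 5
#     step set_choose_tries 100
#     step take default
#     step choose indep 3 type host       # pick 3 hosts...
#     step chooseleaf indep 2 type osd    # ...then 2 OSDs (chunks) on each
#     step emit
# }

# Recompile, inject, and point the EC pool at the new rule
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new
ceph osd pool set ecpool crush_rule ec42_two_per_host   # ecpool = your EC pool

Keep in mind that with one host down in that layout you are left with exactly k=4 chunks, which is why that thread ends up lowering min_size to 4, and why I'd be careful with it.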