If I start to use all the available space that pool can offer (4.5T) and
the first OSD (2.7T) fails, I'm sure I'll end up with lost data, since
it's not possible to fit 4.5T on the 2 remaining drives with a total raw
capacity of 3.6T.
I'm wondering why Ceph isn't complaining now. I thought it would place
data across the disks in such a way that losing any OSD would keep the
data safe for read-only access (by leaving the excess 0.9T on the first
drive unused).
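For intuition, here's my own back-of-the-envelope sketch of the capacity arithmetic (not authoritative; the 4.5T, 2.7T and 3.6T figures are the ones quoted above, and the assumption is that with k=2/m=1 and failure-domain=osd on exactly 3 OSDs, each OSD holds one chunk of every object):

```python
# Back-of-the-envelope EC capacity check (sketch; sizes from the post above).
# With k=2, m=1 every object is split into 2 data chunks + 1 coding chunk,
# so usable capacity is at most raw * k / (k + m).

k, m = 2, 1
osd1 = 2.7           # first OSD, TB
rest_raw = 3.6       # the two remaining OSDs combined, TB
total_raw = osd1 + rest_raw

usable = total_raw * k / (k + m)     # ~4.2 TB if chunks spread evenly
print(usable)

# If osd1 fails, the 2 surviving chunks per object together hold the full
# object (2 of k=2 data-equivalent chunks), so 4.5 TB of stored data needs
# 4.5 TB of raw space on the survivors -- more than their 3.6 TB total:
data_stored = 4.5
surviving_bytes = data_stored * k / k
print(surviving_bytes > rest_raw)    # True -> data can't even be held read-only
```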
Oh, and here are my rule and profile - by mistake I sent them in a PM earlier:
rule ceph3_ec_low_k2_m1-data {
    id 2
    type erasure
    min_size 3
    max_size 3
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default class low_hdd
    step choose indep 0 type osd
    step emit
}
crush-device-class=low_hdd
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=2
m=1
plugin=jerasure
technique=reed_sol_van
w=8
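(For reference, a profile with exactly these parameters would typically be created with something like the following - a sketch, and the profile name here is hypothetical:)

```shell
# Sketch: creating an EC profile with the parameters listed above
# (profile name is made up; key=value pairs are from the listing).
ceph osd erasure-code-profile set ceph3_ec_low_k2_m1 \
    plugin=jerasure technique=reed_sol_van w=8 \
    k=2 m=1 \
    crush-root=default \
    crush-device-class=low_hdd \
    crush-failure-domain=osd

# Verify what was stored:
ceph osd erasure-code-profile get ceph3_ec_low_k2_m1
```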
Paweł
W dniu 8.11.2022 o 15:47, Danny Webb pisze:
With an m value of 1, if you lose a single OSD/failure domain you'd end up with a read-only PG or cluster. Usually you need at least k+1 surviving chunks to keep serving I/O after a failure-domain failure, depending on your min_size setting. The other thing you need to take into consideration is that the m value covers both failure domains *and* OSDs in an unlucky scenario (e.g. a PG that happened to have chunks on a downed host and on a failed OSD elsewhere in the cluster). For a 3-OSD configuration, the minimum fault-tolerant setup would be k=1, m=2, and you're then effectively doing replica 3 anyway. At least this is my understanding of it. Hope that helps.
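The trade-off above can be sketched numerically (my own illustration, not from the thread; the default min_size of k+1 for EC pools is the usual Ceph default):

```python
# Sketch: comparing EC profiles on a 3-OSD cluster (my illustration).
def ec_summary(k, m):
    return {
        "shards": k + m,                # OSDs / failure domains needed
        "overhead": (k + m) / k,        # raw bytes stored per user byte
        "losses_survivable": m,         # chunks that may be lost before data loss
        "default_min_size": k + 1,      # Ceph's usual default for EC pools
    }

# The OP's profile: tolerates 1 chunk loss, but min_size=3 means any
# OSD failure stops I/O on a 3-OSD cluster.
print(ec_summary(2, 1))

# Danny's suggestion: 3x overhead (effectively replica 3), survives 2 losses.
print(ec_summary(1, 2))
```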
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx