OK, I think I can answer this myself: the pool was created with a
default min_size of 3, so when one of the OSDs goes down the pool
doesn't perform any IO. Manually changing the pool's min_size to
2 worked great.
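For anyone hitting the same thing, checking and changing min_size looks roughly like this (the pool name "ecpool" is just a placeholder for your EC pool):

ceph osd pool get ecpool min_size      # show the current value (3 by default here)
ceph osd pool set ecpool min_size 2    # allow IO with only k=2 shards available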
On 24/10/2017 at 10:13, Jorge Pinilla López wrote:
I am testing erasure coded pools and doing a rados write test to
try fault tolerance.
I have 3 nodes with 1 OSD each, K=2 M=1.
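For context, a setup like this would be created with something along these lines (the profile and pool names are just examples, the actual ones used here may differ):

ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
ceph osd pool create ecpool 32 32 erasure ec-2-1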
While performing the write (rados bench -p replicate 100 write), I
stop one of the OSD daemons (for example osd.0), simulating a node
failure, and then the whole write stops and I can't write any data
anymore.
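Roughly the procedure, assuming a systemd-managed cluster (osd.0 as in the example above); the bench output below shows where it stalls:

rados bench -p replicate 100 write     # in one terminal
systemctl stop ceph-osd@0              # on one OSD node, while the bench is running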
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    1      16        28        12   46.8121        48      1.01548    0.616034
    2      16        40        24   47.3907        48      1.04219    0.923728
    3      16        52        36   47.5889        48     0.593145      1.0038
    4      16        68        52   51.6633        64      1.39638     1.08098
    5      16        74        58    46.158        24      1.02699     1.10172
    6      16        83        67   44.4711        36      3.01542     1.18012
    7      16        95        79   44.9722        48     0.776493     1.24003
    8      16        95        79   39.3681         0            -     1.24003
    9      16        95        79   35.0061         0            -     1.24003
   10      16        95        79   31.5144         0            -     1.24003
   11      16        95        79   28.6561         0            -     1.24003
   12      16        95        79   26.2732         0            -     1.24003
It's pretty clear where the OSD failed.
On the other hand, using a replicated pool, the client (rados
bench) doesn't even notice the OSD failure, which is awesome.
Is this normal behaviour for EC pools?
--
Jorge Pinilla López
jorpilo@xxxxxxxxx
Computer engineering student
Systems area intern (SICUZ)
Universidad de Zaragoza
PGP-KeyID: A34331932EBC715A
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com