Re: ceph IO are interrupted when OSD goes down

Eugen Block <eblock@xxxxxx> · Mon, 18 Oct 2021 10:12:15 +0000

Hi,

with this EC setup (k=10, m=2) your pool's min_size would be 11 (k+1),
so if one host goes down (or several OSDs fail on that host), your
clients should not be affected. But as soon as a second host fails
you'll notice IO pausing until at least one host has recovered. Do you
have more than 12 hosts in this cluster, so that it could recover from
one host failure?
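
To make the arithmetic concrete, here is a minimal Python sketch. It is not Ceph code; the function names are made up for illustration, and it assumes the worst case for an affected PG: failure domain is host, so each failed host takes away exactly one of the PG's k+m shards. It also assumes min_size is left at k+1.

```python
def ec_pool_status(k, m, failed_hosts, min_size=None):
    """Shard arithmetic for one PG of an EC pool (hypothetical helper).

    Assumes crush-failure-domain=host, i.e. every failed host removes
    exactly one shard of the PG.
    """
    if min_size is None:
        min_size = k + 1  # assumed value for this pool, per the reply above
    available = max(k + m - failed_hosts, 0)
    return {
        "min_size": min_size,
        "available_shards": available,
        # IO is served only while at least min_size shards are up
        "io_active": available >= min_size,
        # data can still be reconstructed from any k shards
        "data_recoverable": available >= k,
    }


def can_rebuild_elsewhere(total_hosts, k, m, failed_hosts):
    """True if enough hosts remain to place all k+m shards again."""
    return total_hosts - failed_hosts >= k + m


if __name__ == "__main__":
    for failed in range(4):
        print(failed, ec_pool_status(10, 2, failed))
    print(can_rebuild_elsewhere(12, 10, 2, 1))
    print(can_rebuild_elsewhere(13, 10, 2, 1))
```

With k=10 and m=2, one failed host leaves 11 shards (IO continues), two leave 10 (IO pauses, data still intact), and with exactly 12 hosts there is no spare host to rebuild the missing shard on.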

Regards,
Eugen

Quoting Denis Polom <denispolom@xxxxxxxxx>:

Hi,

I have an EC pool with these settings:

crush-device-class= crush-failure-domain=host crush-root=default  
jerasure-per-chunk-alignment=false k=10 m=2 plugin=jerasure  
technique=reed_sol_van w=8

and my understanding is that if some of the OSDs go down because of
read errors or are just flapping for some reason (mostly read errors
and bad sectors in my case), client IO shouldn't be disturbed because
we have other object replicas and Ceph should manage it. But client IO
is disturbed: the cephfs mount point becomes inaccessible on clients
even though they mount cephfs against all 3 monitors.

It doesn't always happen, just sometimes. Is my understanding right
that this can happen if a read error or flapping occurs on an active OSD?

Thx!

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
