Hi,
I don't think increasing the mon_osd_down_out_interval timeout alone
will really help you in this situation; I remember an older thread
about that but couldn't find it. What you could test is setting the
nodown flag (ceph osd set nodown) to prevent flapping OSDs, although
that's not a real solution. Still, depending on how often you see
recovery IO, the nodown flag could make things better until recovery
has finished.
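In case it helps, here's a rough sketch of how I'd use it during a
recovery window (assuming you run the commands from a node with an
admin keyring; adjust as needed):

  # prevent OSDs from being marked down while recovery is running
  ceph osd set nodown

  # watch recovery progress
  ceph -s

  # clear the flag once recovery has finished, otherwise genuinely
  # dead OSDs won't be marked down either
  ceph osd unset nodown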
Quoting Nicola Mori <nicolamori@xxxxxxx>:
Dear Ceph users,
my cluster is made of very old machines on Gbit ethernet. I see
that sometimes some OSDs are marked down due to slow networking,
especially under heavy network load such as during recovery. This
causes problems: for example, PGs keep being deactivated and
activated as the OSDs are marked down and up (at least to the best
of my understanding). So I'd need to know if there is some way to
increase the timeout after which an OSD is marked down, to cope
with my slow network.
Thanks,
Nicola
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx