fail-over taking too long when a node reboots

Hi,
     Does anyone have a complete understanding of the keepalive timeout vs. the TCP User Timeout (UTO) option? For both AFR and EC, when a server reboots it takes 42 seconds for the fops to fail with ENOTCONN (via saved_frames_unwind()). I am wondering whether there is a way to reduce this time by tuning these two options. As per our earlier research on this (I think it was kp who did it), keepalive was not getting triggered when there were fops in progress, and he saw quite a few game-dev forums discuss this problem too. There is a newer option, TCP User Timeout, which seems to address this. Does anyone have experience with it, and can you suggest more meaningful defaults for these timeouts? I think the default at the moment is 42 seconds.
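
To make the two options concrete, here is a minimal sketch of setting both on a plain Linux TCP socket. It is in Python only for brevity and is not Gluster's socket code. The helper name configure_timeouts and all the numeric values are hypothetical and are not proposed defaults. As far as I understand it, keepalive probes are only sent on an idle connection, while TCP_USER_TIMEOUT limits how long transmitted data may stay unacknowledged, which is the case when fops are in flight.

```python
import socket

# TCP_USER_TIMEOUT is Linux-specific. Fall back to its value in
# <linux/tcp.h> (18) if this Python build does not export the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def configure_timeouts(sock, idle_s=20, interval_s=2, probes=9,
                       user_timeout_ms=30000):
    """Enable keepalive and TCP_USER_TIMEOUT on a TCP socket (Linux only).

    All default values here are illustrative assumptions, not tuned numbers.
    """
    # Keepalive: covers a connection that is idle (nothing in flight).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    # User timeout: maximum time in milliseconds that sent data may remain
    # unacknowledged before the kernel drops the connection.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, user_timeout_ms)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    configure_timeouts(s)
    print("TCP_USER_TIMEOUT (ms):",
          s.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT))
    s.close()
```

With these example numbers an idle dead peer would be noticed after roughly idle_s + interval_s * probes seconds, and a dead peer with unacknowledged data after roughly user_timeout_ms. Whether such values suit AFR/EC fail-over is exactly the open question.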

--
Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
