default reconnect timeout?

carl.boberg at memnonnetworks.com (Carl Boberg) · Tue, 10 Jan 2012 15:37:53 +0100



On Tue, Jan 10, 2012 at 15:11, Jeff White <jaw171 at pitt.edu> wrote:

> On 01/10/2012 03:25 AM, Carl Boberg wrote:
>
> Hello,
>
>  I have started evaluating GlusterFS for our application to share data
> and everything seems very satisfactory so far :)
> I only have a few questions that are about the timeout that happens when
> one of the servers goes down.
>
>  We have a simple 2 server setup with 2 replicated volumes.
>
> You should post what version you are using and the output of 'gluster
> volume info'.
>
>
My rookie mistake :)

Volume Name: dev-msg-vol
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfs1:/glusterfs/dev-msg-vol/exp1
Brick2: gfs2:/glusterfs/dev-msg-vol/exp2

Volume Name: dev-attach-vol
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfs1:/glusterfs/dev-attach-vol/exp1
Brick2: gfs2:/glusterfs/dev-attach-vol/exp2


>  About 10 clients connect to both of the volumes, and file sizes range
> from a few bytes to about a couple of megabytes. It's a lot of files and
> folders...
>
> Define 'a lot'.  I'm soon going to migrate 83 million files to Gluster and
> that's not 'a lot' compared to some others.
>
>
Sorry about that. We apparently don't have that many files then :) About a
couple of million, that's all.

>  When one of the servers goes down, all clients seem to hang for about
> 40-50 seconds before being able to access files and folders again. Write
> operations continue during this, but I guess they are being done in memory
> until the file is accessible again.
> What controls this timeout? Is it possible to adjust it? Is it advisable
> to adjust it, if you can?
>
> This might be what you are looking for but I don't know if it's advisable
> to change it.
>
>
> http://www.gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#network.ping-timeout
>
>
Ah. I should have seen that when I was going through the documentation...
Still, it does not say anything about the effects of changing the timeout.
It says it is an "expensive" operation, but why is 42 seconds the agreed
timeout? And how expensive? Expensive in what terms: CPU, I/O, network?
I'm not expecting an answer to this, but it is interesting to note that this
is an important setting for uptime during a server failure in a replicated
setup. If I have an intense read/write application working on a Gluster
volume, I'd be very interested in having this as short as possible. As such,
I just wish there were more information regarding this setting.
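
For reference, a minimal sketch of how the network.ping-timeout option from
that documentation page could be changed on the two volumes listed above,
using the 'gluster volume set' command. The value of 10 seconds is only an
illustration, not a recommendation; whether lowering it is advisable is
exactly the open question here. The helper name set_ping_timeout and the
DRY_RUN and TIMEOUT variables are made up for this example. By default the
script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: set network.ping-timeout on both volumes from this thread.
# TIMEOUT=10 is an illustrative value; the documented default is 42 seconds.
TIMEOUT="${TIMEOUT:-10}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

set_ping_timeout() {
    # $1 = volume name, $2 = timeout in seconds
    if [ "$DRY_RUN" = "1" ]; then
        echo "gluster volume set $1 network.ping-timeout $2"
    else
        gluster volume set "$1" network.ping-timeout "$2"
    fi
}

for vol in dev-msg-vol dev-attach-vol; do
    set_ping_timeout "$vol" "$TIMEOUT"
done
```

Running it with DRY_RUN=0 on one of the servers would apply the change;
'gluster volume info' should then show the option among the reconfigured
options for each volume.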

Thank you for the pointer to the documentation (I should have found it
myself...)

Cheers