Re: Re: Timeout settings and self-healing ? (WAS: HA failover test unsuccessful (inaccessible mountpoint))

My server configs:

http://glusterfs.pastebin.com/m3f82f264

One of the client configs:
http://glusterfs.pastebin.com/d5df7fab
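
For readers who cannot reach the pastebin, here is a rough sketch of where the timeout discussed further down in this thread would go in a protocol/client volume. This is not a copy of the pasted config: the volume name, host address and remote subvolume name are made up for illustration, and only 'option transport-timeout' and 'protocol/client' come from the thread itself.

```
# Hypothetical client volume; names and address are placeholders.
volume client-dfsC
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.10      # assumed address of one storage server
  option remote-subvolume brick        # assumed name of the exported volume
  option transport-timeout 20          # response timeout in seconds, as suggested by Avati
end-volume
```

The same option line would be repeated in every protocol/client volume, on the clients and in any server-side client volumes.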

My problem is, when one of the storage servers is unplugged, I always get the
"Transport endpoint is not connected" message.
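
To put a number on how long the mount stays in that state, something like the following could be run on a client right after the cable is pulled. It is only a sketch: it polls the mountpoint with stat() and reports the error name until the call succeeds. The function names, the polling interval and the time limit are my own choices, not anything GlusterFS provides. "Transport endpoint is not connected" is the message for errno ENOTCONN, so that is what the loop should print while the mount is unavailable.

```python
import errno
import os
import time


def probe(path, stat=os.stat):
    """Return None if path can be stat()ed, else the errno of the failure."""
    try:
        stat(path)
        return None
    except OSError as exc:
        return exc.errno


def time_until_reachable(path, interval=1.0, limit=120.0,
                         stat=os.stat, clock=time.monotonic, sleep=time.sleep):
    """Poll path until stat() succeeds.

    Returns the elapsed seconds, or None if `limit` seconds pass first.
    A stat() call that blocks (the 'hang') is counted in the elapsed time.
    """
    start = clock()
    while clock() - start < limit:
        err = probe(path, stat)
        if err is None:
            return clock() - start
        # errno.errorcode maps numbers to names such as 'ENOTCONN'
        name = errno.errorcode.get(err, str(err))
        print(f"{clock() - start:6.1f}s  {name}")
        sleep(interval)
    return None
```

Calling time_until_reachable("/opt/gfs-mount") with the mountpoint from the test case quoted below would give the recovery time directly, which makes it easier to compare runs with different timeout settings.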



Krishna Srinivas wrote:
Guido,

Can you give the setup details, conf files?
you can use http://glusterfs.pastebin.com for pasting conf files.

Thanks
Krishna

On Fri, Apr 4, 2008 at 2:40 PM, Anand Avati <avati@xxxxxxxxxxxxx> wrote:
Daniel/Guido,
  can you paste the logs which are relevant from the time of unplugging the
 cable till the end of experiment?

 avati

 2008/4/3, Daniel Maher <dma+gluster@xxxxxxxxx>:


 > On Thu, 3 Apr 2008 14:55:48 +0530 "Anand Avati" <avati@xxxxxxxxxxxxx>
 > wrote:
 >
 > > Daniel,
 > >  maybe it is just taking long to detect connection failure. Can you
 > > try with 'option transport-timeout 20' (sets response timeout to 20
 > > seconds) in all your protocol/client and see if you still face the
 > > 'hang' ?
 >
 > My simple test case is as follows :
 > 1. Unplug one of the nodes (dfsD)
 > 2. Attempt to ls -l the /opt/ (in which gfs-mount/ - the mountpoint -
 > is contained)
 >
 > I set the timeout option along with every client instance in both the
 > client and server configs.  I tested timeout settings of 10 and 20
 > seconds (just to see).  In both cases, the 'hang' releases after a while
 > (approx 30 seconds), but the results are odd. For example :
 >
 > # ls -l
 >    (hang ~ 30 seconds)
 > ls: cannot access gfs-mount: Transport endpoint is not connected
 > total 0
 > d????????? ? ? ? ?                ? gfs-mount
 >
 > # ls -l
 >    (immediate)
 > ls: cannot access gfs-mount: Transport endpoint is not connected
 > total 0
 > d????????? ? ? ? ?                ? gfs-mount
 >
 >    (user wait ~ 5 seconds)
 >
 > # ls -l
 > total 8
 > drwxr-xr-x 2 root root 4096 2008-04-03 09:43 gfs-mount
 >
 > It would appear that the "recovery" time, regardless of whether the
 > timeout is set to 10 or 20, is around 35 to 40 seconds - though, at the
 > very least, it recovered.  Is there any reasonable way to bring this
 > period of time down ?
 >
 > Thank you all so much for your feedback on this topic !
 >
 >


_______________________________________________
 Gluster-devel mailing list
 Gluster-devel@xxxxxxxxxx
 http://lists.nongnu.org/mailman/listinfo/gluster-devel




--
Kind regards,

Guido Smit
ComLog B.V.

Televisieweg 133
1322 BE Almere
T. 036 5470500
F. 036 5470481
