Re: gluster 3.0.0 catastrophic crash during basic file creation test

Hello Daniel,

Could you try to reproduce the crash after adding
	option ping-timeout 120
to your client config for every server?
I remember similar crashes in 2.x versions during bonnie tests some months ago.
Please let us know how it goes. Thanks.
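
For illustration, here is a rough sketch of where that line would sit in
replicated-tcp.vol. The volume name and the remote-subvolume value are
assumptions, since the generated file was not posted; keep whatever volgen
wrote and only add the ping-timeout line to the protocol/client volume for
each server (s01 and s02).

```
# Hypothetical client volume for server s01; names are assumed, not taken
# from the actual generated volfile.
volume s01-1
    type protocol/client
    option transport-type tcp
    option remote-host s01
    option remote-subvolume brick1    # assumed export name
    option ping-timeout 120           # the suggested addition
end-volume
```

The same line would go into the corresponding volume for s02, and the
clients would need to be remounted for the change to take effect.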

Regards,
Stephan





On Thu, 04 Feb 2010 16:43:29 +0100
Daniel Maher <dma+gluster@xxxxxxxxx> wrote:

> 
> Hello,
> 
> I managed to crash Gluster 3.0.0 severely during a simple file creation 
> test.  Not only did the crash result in the standard « transport 
> endpoint not connected » problem, but the servers in question had to be 
> hard-reset in order to make them operational again.
> 
> So, here goes...
> 
> 4 nodes, two servers, two clients, client-side replication.  Clients are 
> Fedora 8, servers are Fedora 9.  Stock FUSE used throughout. 
> Configurations generated with the volgen tool using the following 
> commandline :
> 
> # glusterfs-volgen --name replicated --raid 1 s01:/opt/gluster s02:/opt/gluster
> 
> Servers :
> # service glusterfsd start
> 
> Clients :
> # mount -t glusterfs /etc/glusterfs/replicated-tcp.vol /opt/gluster/
> 
> The following Python script was used to run the file creation test :
> http://nfsv4.bullopensource.org/tools/tests_tools/test_files.py
> 
> The Python script was edited only to point the target directory to the 
> Gluster mount.  Each client was told to use a different sub-directory 
> within the Gluster mount point.
> 
> This script was used in the context of a bash looping script, which is 
> as follows :
> #!/bin/bash
> LOOP=0
> while [ $LOOP -lt 1000 ]
> do
>      time ./test_files.py | tee -a go_test_files.log
>      cat ./test_files_orw | tee -a go_test_files.log
>      let LOOP=$LOOP+1
> done
> 
> « test_files_orw » is the file that test_files.py outputs to.  It is 
> over-written on each run (hence the redirect).
> 
> The script made it through 20 or so iterations before Gluster crashed. 
> The servers responded to ping requests, but no new SSH connections could 
> be made.  Existing sessions open via SSH were frozen.  On the local 
> console, keyboard interactions were still possible, but no new actions 
> could be taken.  The servers were hard-reset at this point.
> 
> I'll be happy to provide any further information as is deemed necessary 
> - just let me know.
> 
> 
> -- 
> Daniel Maher <dma+gluster AT witbe DOT net>
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> 





