Re: client cannot reconnect

Joe Julian <joe@xxxxxxxxxxxxxxxx> · Wed, 12 Sep 2012 18:54:16 -0700

I saw the same thing. I can't tell you which of my many up-down 
sequences caused it, but I ended up in a state where the client was 
connected to two out of three servers and the log showed no attempts to 
reconnect. I finally changed the graph, which triggered the client(s) 
to reconnect to the third server.

On 09/12/2012 06:53 PM, Emmanuel Dreyfus wrote:
Anand Avati <anand.avati@xxxxxxxxx> wrote:

Can you give some more background on what you did to reproduce this, and
also logs from the server?

This is the first time I have hit this issue, and I am not sure I can
reproduce it. This is a 2x2 replicated/distributed volume using 4 bricks
on 3 servers.

The brick logs:
http://ftp.espci.fr/shadow/manu/gluster-enotconn-1.log
http://ftp.espci.fr/shadow/manu/gluster-enotconn-2.log
http://ftp.espci.fr/shadow/manu/gluster-enotconn-3.log
http://ftp.espci.fr/shadow/manu/gluster-enotconn-4.log

My stress test has always been the same:
Fetch these tarballs:
ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-5.1.2/source/sets/src.tgz
ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-5.1.2/source/sets/sharesrc.tgz
ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-5.1.2/source/sets/gnusrc.tgz
ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-5.1.2/source/sets/syssrc.tgz
unpack, then
cd usr/src && ./build.sh -Uom mac68k release
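
The steps above could be scripted roughly as follows. This is only a sketch, not the script actually used: WORKDIR is a hypothetical path standing in for a directory on the mounted volume, and fetching with "ftp URL" assumes NetBSD's ftp(1), which accepts URLs on the command line. The script is a dry run by default and only prints the commands; set RUN=1 to execute them.

```sh
#!/bin/sh
# Sketch of the stress test: fetch the four source sets, unpack them,
# then build a release. Dry run unless RUN=1 is set in the environment.

BASE=ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-5.1.2/source/sets
SETS="src sharesrc gnusrc syssrc"

# Assumption: a directory on the mounted GlusterFS volume (hypothetical path).
WORKDIR=${WORKDIR:-/mnt/gluster/stress}

# Execute the command when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

stress_test() {
    run cd "$WORKDIR" || return 1
    for s in $SETS; do
        # Assumption: NetBSD ftp(1) auto-fetch; substitute another fetcher elsewhere.
        run ftp "$BASE/$s.tgz" || return 1
        run tar -xzf "$s.tgz" || return 1
    done
    run cd usr/src || return 1
    run ./build.sh -Uom mac68k release
}

stress_test
```

Running it with WORKDIR pointing at the client mount keeps all of the file creation and build I/O on the volume under test.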

I built mac68k this time, but i386, amd64 or whatever on previous runs;
I am not sure it is relevant. The test is interesting because it opens a
lot of files and does many things on them;