Hi all,

I have a client and server set up with the pre6 version of glusterfs. Several times a day the client mount freezes, as does any command that tries to read from the mountpoint. I have to kill the glusterfs process, unmount the directory, and remount it to get it working again.

When this happens, another glusterfs client on a different machine, connected to the same server, does not get disconnected, so the timeout message in the logs is confusing to me. If it's really timing out, wouldn't the other client be disconnected too?

This is on CentOS 5 with fuse 2.7.0-glfs.

When it happens, here's what shows up in the client log:

...
2007-07-25 09:45:59 D [inode.c:327:__active_inode] fuse/inode: activating inode(4210807), lru=0/1024
2007-07-25 09:45:59 D [inode.c:285:__destroy_inode] fuse/inode: destroy inode(4210807)
2007-07-25 12:37:26 W [client-protocol.c:211:call_bail] brick: activating bail-out. pending frames = 1. last sent = 2007-07-25 12:33:42. last received = 2007-07-25 11:42:59 transport-timeout = 120
2007-07-25 12:37:26 C [client-protocol.c:219:call_bail] brick: bailing transport
2007-07-25 12:37:26 W [client-protocol.c:4189:client_protocol_cleanup] brick: cleaning up state in transport object 0x80a03d0
2007-07-25 12:37:26 W [client-protocol.c:4238:client_protocol_cleanup] brick: forced unwinding frame type(0) op(15)
2007-07-25 12:37:26 C [tcp.c:81:tcp_disconnect] brick: connection disconnected

When it happens, here's what shows up in the server log:

2007-07-25 15:37:40 E [protocol.c:346:gf_block_unserialize_transport] libglusterfs/protocol: full_read of block failed: peer ( 192.168.2.3:1023)
2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection disconnected
2007-07-25 15:37:40 E [protocol.c:251:gf_block_unserialize_transport] libglusterfs/protocol: EOF from peer (192.168.2.4:1023)
2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection disconnected

And here's the client backtrace:

(gdb) bt
#0  0x0032e7a2 in _dl_sysinfo_int80 ()
   from /lib/ld-linux.so.2
#1  0x005a3824 in raise () from /lib/tls/libpthread.so.0
#2  0x00655b0c in tcp_bail (this=0x80a03d0) at ../../../../transport/tcp/tcp.c:146
#3  0x00695bbc in transport_bail (this=0x80a03d0) at transport.c:192
#4  0x00603a16 in call_bail (trans=0x80a03d0) at client-protocol.c:220
#5  0x00696870 in gf_timer_proc (ctx=0xbffeec30) at timer.c:119
#6  0x0059d3cc in start_thread () from /lib/tls/libpthread.so.0
#7  0x00414c3e in clone () from /lib/tls/libc.so.6

client config:

### Add client feature and attach to remote subvolume
volume brick
  type protocol/client
  option transport-type tcp/client     # for TCP/IP transport
  option remote-host 192.168.2.5       # IP address of the remote brick
  option remote-subvolume brick_1      # name of the remote volume
end-volume

#
#### Add writeback feature
volume brick-wb
  type performance/write-behind
  option aggregate-size 131072         # unit in bytes
  subvolumes brick
end-volume

server config:

### Export volume "brick" with the contents of "/home/export" directory.
volume brick_1
  type storage/posix
  option directory /home/vg_3ware1/vivalog/brick_1
end-volume

volume brick_2
  type storage/posix
  option directory /home/vg_3ware1/vivalog/brick_2
end-volume

### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server     # For TCP/IP transport
  option bind-address 192.168.2.5      # Default is to listen on all interfaces
  subvolumes brick_1
  option auth.ip.brick_2.allow *       # Allow access to "brick" volume
  option auth.ip.brick_1.allow *       # Allow access to "brick" volume
end-volume

PS: I have one server serving two volume bricks to two physically distinct clients. I assume this is okay, and that I don't need two separate server declarations.
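
In case it helps anyone reading the call_bail line in the client log above, here is a minimal Python sketch that pulls out the three timestamps and compares the gaps against the reported transport-timeout. The function name bail_gaps and the regular expression are my own, written only for this one log line; they are not part of glusterfs, and other log formats may need a different pattern.

```python
import re
from datetime import datetime

# The call_bail line from the client log, copied verbatim.
LOG_LINE = (
    "2007-07-25 12:37:26 W [client-protocol.c:211:call_bail] brick: "
    "activating bail-out. pending frames = 1. "
    "last sent = 2007-07-25 12:33:42. "
    "last received = 2007-07-25 11:42:59 transport-timeout = 120"
)

# Hypothetical pattern, assumed to match only this style of line.
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
PATTERN = re.compile(
    r"^" + TS
    + r"\s.*last sent = " + TS
    + r"\.\s+last received = " + TS
    + r"\s+transport-timeout = (\d+)"
)


def bail_gaps(line):
    """Return seconds elapsed since the last send and last receive."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a call_bail message")
    logged, sent, received = (
        datetime.strptime(match.group(i), "%Y-%m-%d %H:%M:%S")
        for i in (1, 2, 3)
    )
    return {
        "since_last_sent": int((logged - sent).total_seconds()),
        "since_last_received": int((logged - received).total_seconds()),
        "transport_timeout": int(match.group(4)),
    }


if __name__ == "__main__":
    for key, value in bail_gaps(LOG_LINE).items():
        print(key, value)
```

For the line above this gives 224 seconds since the last request was sent and 3267 seconds since the last reply was received, both larger than the 120-second transport-timeout shown in the log.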