pre6 hanging problems

Hi all -

I have a client and server set up with the pre6 version of glusterfs. Several
times a day the client mount freezes, as does any command that tries to read
from the mountpoint. I have to kill the glusterfs process, unmount the
directory, and remount it to get it working again.

When this happens, a glusterfs client on another machine connected to the
same server does not get disconnected, so the timeout message in the logs
is confusing to me. If the server were really timing out, wouldn't the
other client be disconnected, too?

This is on CentOS 5 with fuse 2.7.0-glfs.

When it happens, here's what shows up in the client log:

...
2007-07-25 09:45:59 D [inode.c:327:__active_inode] fuse/inode: activating inode(4210807), lru=0/1024
2007-07-25 09:45:59 D [inode.c:285:__destroy_inode] fuse/inode: destroy inode(4210807)
2007-07-25 12:37:26 W [client-protocol.c:211:call_bail] brick: activating bail-out. pending frames = 1. last sent = 2007-07-25 12:33:42. last received = 2007-07-25 11:42:59 transport-timeout = 120
2007-07-25 12:37:26 C [client-protocol.c:219:call_bail] brick: bailing transport
2007-07-25 12:37:26 W [client-protocol.c:4189:client_protocol_cleanup] brick: cleaning up state in transport object 0x80a03d0
2007-07-25 12:37:26 W [client-protocol.c:4238:client_protocol_cleanup] brick: forced unwinding frame type(0) op(15)
2007-07-25 12:37:26 C [tcp.c:81:tcp_disconnect] brick: connection disconnected
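
For reference, here is a small sketch of the arithmetic on the call_bail line above, to show how long the request had been pending relative to transport-timeout. It is only date math on the logged timestamps; the helper name seconds_between is my own and nothing here is glusterfs code.

```python
from datetime import datetime

# Timestamp format used in the glusterfs log lines above.
FMT = "%Y-%m-%d %H:%M:%S"

def seconds_between(earlier: str, later: str) -> int:
    """Whole seconds from `earlier` to `later` (hypothetical helper, not a glusterfs API)."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds())

# Values copied from the call_bail warning in the client log.
bail_time = "2007-07-25 12:37:26"
last_sent = "2007-07-25 12:33:42"
last_received = "2007-07-25 11:42:59"
transport_timeout = 120  # seconds, as printed in the log

since_sent = seconds_between(last_sent, bail_time)
since_received = seconds_between(last_received, bail_time)

print(f"since last sent:     {since_sent} s")
print(f"since last received: {since_received} s")
print(f"exceeds timeout:     {since_sent > transport_timeout}")
```

So the last request had been outstanding for well over the 120-second timeout, and nothing at all had come back from the server for close to an hour before the bail-out.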

When it happens, here's what shows up in the server log:

2007-07-25 15:37:40 E [protocol.c:346:gf_block_unserialize_transport] libglusterfs/protocol: full_read of block failed: peer (192.168.2.3:1023)
2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection disconnected
2007-07-25 15:37:40 E [protocol.c:251:gf_block_unserialize_transport] libglusterfs/protocol: EOF from peer (192.168.2.4:1023)
2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection disconnected
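
The client and server timestamps do not line up directly. Assuming the server disconnect at 15:37:40 corresponds to the client bail-out at 12:37:26 (an assumption on my part, the logs do not prove it), this sketch splits the difference into a whole-hour offset, which would be consistent with a timezone or clock setting difference between the two machines, and a residual in seconds.

```python
from datetime import datetime

# Timestamp format used in both the client and server logs.
FMT = "%Y-%m-%d %H:%M:%S"

# Assumption: these two log events describe the same disconnect.
client_bail = datetime.strptime("2007-07-25 12:37:26", FMT)
server_disconnect = datetime.strptime("2007-07-25 15:37:40", FMT)

# Total difference between the two clocks, in seconds.
offset_seconds = int((server_disconnect - client_bail).total_seconds())

# Split into the nearest whole-hour offset and the leftover seconds.
offset_hours = round(offset_seconds / 3600)
residual_seconds = offset_seconds - offset_hours * 3600

print(f"total offset: {offset_seconds} s")
print(f"whole hours:  {offset_hours}")
print(f"residual:     {residual_seconds} s")
```

If that assumption holds, server log times need about three hours subtracted before comparing them with the client log.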

And here's the client backtrace:

(gdb) bt
#0  0x0032e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x005a3824 in raise () from /lib/tls/libpthread.so.0
#2  0x00655b0c in tcp_bail (this=0x80a03d0) at ../../../../transport/tcp/tcp.c:146
#3  0x00695bbc in transport_bail (this=0x80a03d0) at transport.c:192
#4  0x00603a16 in call_bail (trans=0x80a03d0) at client-protocol.c:220
#5  0x00696870 in gf_timer_proc (ctx=0xbffeec30) at timer.c:119
#6  0x0059d3cc in start_thread () from /lib/tls/libpthread.so.0
#7  0x00414c3e in clone () from /lib/tls/libc.so.6


client config:

### Add client feature and attach to remote subvolume
volume brick
  type protocol/client
  option transport-type tcp/client     # for TCP/IP transport
  option remote-host 192.168.2.5       # IP address of the remote brick
  option remote-subvolume brick_1  # name of the remote volume
end-volume

### Add write-behind feature
volume brick-wb
  type performance/write-behind
  option aggregate-size 131072 # unit in bytes
  subvolumes brick
end-volume

server config:

### Export volume "brick" with the contents of "/home/export" directory.
volume brick_1
  type storage/posix
  option directory /home/vg_3ware1/vivalog/brick_1
end-volume

volume brick_2
  type storage/posix
  option directory /home/vg_3ware1/vivalog/brick_2
end-volume

### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server     # For TCP/IP transport
  option bind-address 192.168.2.5     # Default is to listen on all interfaces
  subvolumes brick_1
  option auth.ip.brick_2.allow * # Allow access to "brick_2" volume
  option auth.ip.brick_1.allow * # Allow access to "brick_1" volume
end-volume

P.S. I have one server serving two volume bricks to two physically distinct
clients. I assume this is okay and that I don't need two separate server
declarations.

