hi, here is the log:

gdb glusterfs
GNU gdb (GDB; SUSE Linux Enterprise 11) 6.8.50.20081120-cvs
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>...
(gdb) r --debug
Starting program: /usr/sbin/glusterfs --debug
[Thread debugging using libthread_db enabled]
[2009-12-21 08:15:17] D [glusterfsd.c:424:_get_specfp] glusterfs: loading volume file /etc/glusterfs/glusterfs.vol
[2009-12-21 08:15:17] D [xlator.c:739:xlator_set_type] xlator: dlsym(notify) on /usr/lib64/glusterfs/3.0.0/xlator/features/locks.so: undefined symbol: notify -- neglecting
[2009-12-21 08:15:17] D [xlator.c:739:xlator_set_type] xlator: dlsym(notify) on /usr/lib64/glusterfs/3.0.0/xlator/performance/write-behind.so: undefined symbol: notify -- neglecting
[2009-12-21 08:15:17] D [xlator.c:739:xlator_set_type] xlator: dlsym(notify) on /usr/lib64/glusterfs/3.0.0/xlator/performance/io-threads.so: undefined symbol: notify -- neglecting
[2009-12-21 08:15:17] D [xlator.c:744:xlator_set_type] xlator: dlsym(dumpops) on /usr/lib64/glusterfs/3.0.0/xlator/performance/io-threads.so: undefined symbol: dumpops -- neglecting
[2009-12-21 08:15:17] D [xlator.c:739:xlator_set_type] xlator: dlsym(notify) on /usr/lib64/glusterfs/3.0.0/xlator/performance/quick-read.so: undefined symbol: notify -- neglecting
================================================================================
Version : glusterfs 3.0.0 built on Dec 9 2009 10:05:41
git: 2.0.1-886-g8379edd
Starting Time: 2009-12-21 08:15:17
Command line : /usr/sbin/glusterfs --debug
PID : 7472
System name : Linux
Nodename : gfs-01-02
Kernel Release : 2.6.27.19-5-default
Hardware Identifier: x86_64

Given volfile:
+------------------------------------------------------------------------------+
  1: # export-office-data02-server_repl
  2: # gfs-01-01 /GFS/office-data02
  3: # gfs-01-02 /GFS/office-data02
  4:
  5: volume posix
  6:   type storage/posix
  7:   option directory /GFS/office-data02
  8: end-volume
  9:
 10: volume locks
 11:   type features/locks
 12:   subvolumes posix
 13: end-volume
 14:
 15: volume posix-remote
 16:   type protocol/client
 17:   option transport-type tcp
 18:   option ping-timeout 5
 19:   option remote-host gfs-01-01
 20:   option remote-port 7000
 21:   option remote-subvolume locks
 22: end-volume
 23:
 24: volume gfs-replicate
 25:   type cluster/replicate
 26:   subvolumes posix-remote
 27: end-volume
 28:
 29: volume writebehind
 30:   type performance/write-behind
 31:   option cache-size 2MB
 32:   option flush-behind on
 33:   subvolumes gfs-replicate
 34: end-volume
 35:
 36: volume office-data02
 37:   type performance/io-threads
 38:   option thread-count 32 # default is 16
 39:   subvolumes writebehind
 40: end-volume
 41:
 42: volume quickread
 43:   type performance/quick-read
 44:   option cache-timeout 1
 45:   option max-file-size 512kB
 46: # subvolumes web-data
 47:   subvolumes office-data02
 48: end-volume
 49:
 50: volume server
 51:   type protocol/server
 52:   option transport-type tcp
 53:   option transport.socket.listen-port 7000
 54:   option auth.addr.office-data02.allow 192.168.11.*
 55:   option auth.addr.locks.allow 192.168.11.*
 56:   subvolumes office-data02 locks
 57: end-volume
+------------------------------------------------------------------------------+
[2009-12-21 08:15:17] D [glusterfsd.c:1335:main] glusterfs: running in pid 7472
[New Thread 0x7ffff5b45950 (LWP 7475)]
[2009-12-21 08:15:17] D [transport.c:145:transport_load] transport: attempt to load file /usr/lib64/glusterfs/3.0.0/transport/socket.so
[2009-12-21 08:15:17] D [xlator.c:284:_volume_option_value_validate] server: no range check required for 'option transport.socket.listen-port 7000'
[2009-12-21 08:15:17] W [quick-read.c:2187:init] quickread: dangling volume. check volfile
[2009-12-21 08:15:17] D [io-threads.c:2841:init] office-data02: io-threads: Autoscaling: off, min_threads: 32, max_threads: 32
[New Thread 0x7ffff7f85950 (LWP 7476)]
[New Thread 0x7ffff4f36950 (LWP 7477)]
[New Thread 0x7ffff4e35950 (LWP 7478)]
[New Thread 0x7ffff4d34950 (LWP 7479)]
[New Thread 0x7ffff4c33950 (LWP 7480)]
[New Thread 0x7ffff4b32950 (LWP 7481)]
[New Thread 0x7ffff4a31950 (LWP 7482)]
[New Thread 0x7ffff4930950 (LWP 7483)]
[New Thread 0x7ffff482f950 (LWP 7484)]
[New Thread 0x7ffff472e950 (LWP 7485)]
[New Thread 0x7ffff462d950 (LWP 7486)]
[New Thread 0x7ffff452c950 (LWP 7487)]
[New Thread 0x7ffff442b950 (LWP 7488)]
[New Thread 0x7ffff432a950 (LWP 7489)]
[New Thread 0x7ffff4229950 (LWP 7490)]
[New Thread 0x7ffff4128950 (LWP 7491)]
[New Thread 0x7ffff4027950 (LWP 7492)]
[New Thread 0x7ffff3f26950 (LWP 7493)]
[New Thread 0x7ffff3e25950 (LWP 7494)]
[New Thread 0x7ffff3d24950 (LWP 7495)]
[New Thread 0x7ffff3c23950 (LWP 7496)]
[New Thread 0x7ffff3b22950 (LWP 7497)]
[New Thread 0x7ffff3a21950 (LWP 7498)]
[New Thread 0x7ffff3920950 (LWP 7499)]
[New Thread 0x7ffff381f950 (LWP 7500)]
[New Thread 0x7ffff371e950 (LWP 7501)]
[New Thread 0x7ffff361d950 (LWP 7502)]
[New Thread 0x7ffff351c950 (LWP 7503)]
[New Thread 0x7ffff341b950 (LWP 7504)]
[New Thread 0x7ffff331a950 (LWP 7505)]
[New Thread 0x7ffff3219950 (LWP 7506)]
[New Thread 0x7ffff3118950 (LWP 7507)]
[2009-12-21 08:15:17] D [write-behind.c:2480:init] writebehind: disabling write-behind for first 1 bytes
[2009-12-21 08:15:17] D [write-behind.c:2531:init] writebehind: enabling flush-behind
[2009-12-21 08:15:17] D [client-protocol.c:6581:init] posix-remote: defaulting frame-timeout to 30mins
[2009-12-21 08:15:17] D [client-protocol.c:6589:init] posix-remote: setting ping-timeout to 5
[2009-12-21 08:15:17] D [transport.c:145:transport_load] transport: attempt to load file /usr/lib64/glusterfs/3.0.0/transport/socket.so
[2009-12-21 08:15:17] D [xlator.c:284:_volume_option_value_validate] posix-remote: no range check required for 'option remote-port 7000'
[2009-12-21 08:15:17] D [transport.c:145:transport_load] transport: attempt to load file /usr/lib64/glusterfs/3.0.0/transport/socket.so
[2009-12-21 08:15:17] D [xlator.c:284:_volume_option_value_validate] posix-remote: no range check required for 'option remote-port 7000'
[2009-12-21 08:15:17] D [client-protocol.c:7005:notify] posix-remote: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-12-21 08:15:17] D [client-protocol.c:7005:notify] posix-remote: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-12-21 08:15:17] D [client-protocol.c:7005:notify] posix-remote: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-12-21 08:15:17] D [client-protocol.c:7005:notify] posix-remote: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-12-21 08:15:17] N [glusterfsd.c:1361:main] glusterfs: Successfully started
[2009-12-21 08:15:17] D [client-protocol.c:7019:notify] posix-remote: got GF_EVENT_CHILD_UP
[2009-12-21 08:15:17] D [client-protocol.c:7019:notify] posix-remote: got GF_EVENT_CHILD_UP
[2009-12-21 08:15:17] N [client-protocol.c:6224:client_setvolume_cbk] posix-remote: Connected to 192.168.11.11:7000, attached to remote volume 'locks'.
[2009-12-21 08:15:17] N [afr.c:2625:notify] gfs-replicate: Subvolume 'posix-remote' came back up; going online.
[2009-12-21 08:15:17] N [client-protocol.c:6224:client_setvolume_cbk] posix-remote: Connected to 192.168.11.11:7000, attached to remote volume 'locks'.
[2009-12-21 08:15:17] N [afr.c:2625:notify] gfs-replicate: Subvolume 'posix-remote' came back up; going online.
[2009-12-21 08:15:28] D [addr.c:190:gf_auth] locks: allowed = "192.168.11.*", received addr = "192.168.11.11"
[2009-12-21 08:15:28] N [server-protocol.c:5809:mop_setvolume] server: accepted client from 192.168.11.11:1023
[2009-12-21 08:15:28] D [addr.c:190:gf_auth] locks: allowed = "192.168.11.*", received addr = "192.168.11.11"
[2009-12-21 08:15:28] N [server-protocol.c:5809:mop_setvolume] server: accepted client from 192.168.11.11:1022
[2009-12-21 08:16:32] D [addr.c:190:gf_auth] office-data02: allowed = "192.168.11.*", received addr = "192.168.11.68"
[2009-12-21 08:16:32] N [server-protocol.c:5809:mop_setvolume] server: accepted client from 192.168.11.68:1023
[2009-12-21 08:16:32] D [addr.c:190:gf_auth] office-data02: allowed = "192.168.11.*", received addr = "192.168.11.68"
[2009-12-21 08:16:32] N [server-protocol.c:5809:mop_setvolume] server: accepted client from 192.168.11.68:1022
[2009-12-21 09:25:11] E [client-protocol.c:415:client_ping_timer_expired] posix-remote: Server 192.168.11.11:7000 has not responded in the last 5 seconds, disconnecting.
[2009-12-21 09:25:11] E [saved-frames.c:165:saved_frames_unwind] posix-remote: forced unwinding frame type(1) op(LOOKUP)
[2009-12-21 09:25:11] E [saved-frames.c:165:saved_frames_unwind] posix-remote: forced unwinding frame type(2) op(PING)
[2009-12-21 09:25:11] D [client-protocol.c:516:client_ping_cbk] posix-remote: timer must have expired
[2009-12-21 09:25:11] N [client-protocol.c:6972:notify] posix-remote: disconnected
[2009-12-21 09:25:11] E [afr.c:2655:notify] gfs-replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-12-21 09:25:11] D [server-protocol.c:2468:server_lookup_cbk] server: 16: LOOKUP / (1) ==> -1 (Transport endpoint is not connected)
[2009-12-21 09:25:35] E [socket.c:760:socket_connect_finish] posix-remote: connection to 192.168.11.11:7000 failed (No route to host)
[2009-12-21 09:25:35] E [socket.c:760:socket_connect_finish] posix-remote: connection to 192.168.11.11:7000 failed (No route to host)
[2009-12-21 09:26:14] D [client-protocol.c:7019:notify] posix-remote: got GF_EVENT_CHILD_UP
[2009-12-21 09:26:14] D [client-protocol.c:7019:notify] posix-remote: got GF_EVENT_CHILD_UP
[2009-12-21 09:26:14] N [client-protocol.c:6224:client_setvolume_cbk] posix-remote: Connected to 192.168.11.11:7000, attached to remote volume 'locks'.
[2009-12-21 09:26:14] N [afr.c:2625:notify] gfs-replicate: Subvolume 'posix-remote' came back up; going online.
[2009-12-21 09:26:14] N [client-protocol.c:6224:client_setvolume_cbk] posix-remote: Connected to 192.168.11.11:7000, attached to remote volume 'locks'.
[2009-12-21 09:26:14] N [afr.c:2625:notify] gfs-replicate: Subvolume 'posix-remote' came back up; going online.

hmm...

[2009-12-21 09:25:11] E [afr.c:2655:notify] gfs-replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-12-21 09:25:11] D [server-protocol.c:2468:server_lookup_cbk] server: 16: LOOKUP / (1) ==> -1 (Transport endpoint is not connected)

Why are all subvolumes down when I reboot gfs-01-01? I would expect the subvolume to stay online, because gfs-01-02 is still online.

On the client:

ls: cannot open directory .: Transport endpoint is not connected

Is there something wrong with the vol file on the server, or with gluster?

regards
roland
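
One way to check the vol file question is to look at what each volume lists under "subvolumes". In the volfile above, gfs-replicate lists only posix-remote, which would be consistent with the "All subvolumes are down" message while gfs-01-01 is rebooting. The following is a minimal Python sketch of such a check, not a GlusterFS tool: the function names parse_volfile, under_replicated and unreferenced are hypothetical, the volfile text is copied from the log with the option lines left out for brevity, and the rule "a replicate volume should list at least two subvolumes" is an assumption about the intended setup.

```python
# Volume graph copied from the volfile in the log above.
# Option lines are omitted; only "type" and "subvolumes" matter here.
VOLFILE = """\
volume posix
  type storage/posix
end-volume
volume locks
  type features/locks
  subvolumes posix
end-volume
volume posix-remote
  type protocol/client
end-volume
volume gfs-replicate
  type cluster/replicate
  subvolumes posix-remote
end-volume
volume writebehind
  type performance/write-behind
  subvolumes gfs-replicate
end-volume
volume office-data02
  type performance/io-threads
  subvolumes writebehind
end-volume
volume quickread
  type performance/quick-read
  subvolumes office-data02
end-volume
volume server
  type protocol/server
  subvolumes office-data02 locks
end-volume
"""


def parse_volfile(text):
    """Return {volume name: {"type": str, "subvolumes": [names]}}."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        # Drop "# ..." comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is not None and words[0] == "type" and len(words) == 2:
            current["type"] = words[1]
        elif current is not None and words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def under_replicated(volumes):
    """Replicate volumes with fewer than two subvolumes (assumed rule)."""
    return [name for name, vol in volumes.items()
            if vol["type"] == "cluster/replicate" and len(vol["subvolumes"]) < 2]


def unreferenced(volumes):
    """Volumes that no other volume lists as a subvolume."""
    used = {sub for vol in volumes.values() for sub in vol["subvolumes"]}
    return [name for name in volumes if name not in used]


vols = parse_volfile(VOLFILE)
print("replicate volumes with < 2 subvolumes:", under_replicated(vols))
print("volumes nobody references:", unreferenced(vols))
```

For this volfile the first list contains gfs-replicate. The second list contains server, which is the top of the graph, and quickread, which matches the "dangling volume. check volfile" warning in the log.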