This volume is now being tested by my colleague for Windows purposes. I will create a new one on Monday and will test with the parameters you've sent me.

2014-10-17 17:36 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:

Roman,
Everything in the logs looks okay to me, except the following profile number:
3.91 1255944.81 us 127.00 us 23397532.00 us 189 FSYNC
It seems that at least one of the fsyncs took almost 23 seconds to complete. Based on all the data you have given so far, this is the only thing I feel could have caused it. To test this, could you turn off the following option and try again?
gluster volume set <volname> cluster.ensure-durability off
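(A minimal sketch of how this could be applied and checked, assuming the standard gluster CLI; the volume name HA-WIN-TT-1T is taken from the logs below, and the final reset is only needed if you want the default back afterwards:)

    # apply the suggested setting
    gluster volume set HA-WIN-TT-1T cluster.ensure-durability off

    # confirm it is listed under "Options Reconfigured"
    gluster volume info HA-WIN-TT-1T

    # profile the next test run and dump the per-fop latencies again
    gluster volume profile HA-WIN-TT-1T start
    gluster volume profile HA-WIN-TT-1T info

    # revert to the default if it makes no difference
    gluster volume reset HA-WIN-TT-1T cluster.ensure-durability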
Let me know what happened. I am extremely curious to hear about it.
Pranith
On 10/17/2014 12:04 PM, Roman wrote:
mount
[2014-10-13 17:36:56.758654] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs --direct-io-mode=enable --fuse-mountopts=default_permissions,allow_other,max_read=131072 --volfile-server=stor1 --volfile-server=stor2 --volfile-id=HA-WIN-TT-1T --fuse-mountopts=default_permissions,allow_other,max_read=131072 /srv/nfs/HA-WIN-TT-1T)[2014-10-13 17:36:56.762162] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:36:56.762223] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:36:56.766686] I [dht-shared.c:311:dht_init_regex] 0-HA-WIN-TT-1T-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$[2014-10-13 17:36:56.768887] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-1: SSL support is NOT enabled[2014-10-13 17:36:56.768939] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-1: using system polling thread[2014-10-13 17:36:56.769280] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-0: SSL support is NOT enabled[2014-10-13 17:36:56.769294] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-0: using system polling thread[2014-10-13 17:36:56.769336] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:36:56.769829] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-1: parent translators are ready, attempting connect on transportFinal graph:+------------------------------------------------------------------------------+1: volume HA-WIN-TT-1T-client-02: type protocol/client3: option remote-host stor14: option remote-subvolume /exports/NFS-WIN/1T5: option transport-type socket6: option ping-timeout 107: option send-gids true8: end-volume9:10: volume HA-WIN-TT-1T-client-111: type protocol/client12: option remote-host stor213: option remote-subvolume /exports/NFS-WIN/1T14: option transport-type socket15: option ping-timeout 1016: option send-gids true17: end-volume18:19: volume HA-WIN-TT-1T-replicate-020: type cluster/replicate21: subvolumes HA-WIN-TT-1T-client-0 HA-WIN-TT-1T-client-122: end-volume23:24: volume HA-WIN-TT-1T-dht25: type cluster/distribute26: subvolumes HA-WIN-TT-1T-replicate-027: end-volume28:29: volume HA-WIN-TT-1T-write-behind30: type performance/write-behind31: subvolumes HA-WIN-TT-1T-dht32: end-volume33:34: volume HA-WIN-TT-1T-read-ahead35: type performance/read-ahead36: subvolumes HA-WIN-TT-1T-write-behind37: end-volume38:39: volume HA-WIN-TT-1T-io-cache40: type performance/io-cache41: subvolumes HA-WIN-TT-1T-read-ahead42: end-volume43:44: volume HA-WIN-TT-1T-quick-read45: type performance/quick-read46: subvolumes HA-WIN-TT-1T-io-cache47: end-volume48:49: volume HA-WIN-TT-1T-open-behind50: type performance/open-behind51: subvolumes HA-WIN-TT-1T-quick-read52: end-volume53:54: volume HA-WIN-TT-1T-md-cache55: type performance/md-cache56: subvolumes HA-WIN-TT-1T-open-behind57: end-volume58:59: volume HA-WIN-TT-1T60: type debug/io-stats61: option latency-measurement off62: option count-fop-hits off63: subvolumes HA-WIN-TT-1T-md-cache64: end-volume65:+------------------------------------------------------------------------------+[2014-10-13 17:36:56.770718] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-1: changing port to 49160 (from 0)[2014-10-13 17:36:56.771378] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)[2014-10-13 17:36:56.772008] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-1: 
Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:36:56.772083] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:36:56.772338] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Connected to 10.250.0.2:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:36:56.772361] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:36:56.772424] I [afr-common.c:4131:afr_notify] 0-HA-WIN-TT-1T-replicate-0: Subvolume 'HA-WIN-TT-1T-client-1' came back up; going online.[2014-10-13 17:36:56.772463] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Connected to 10.250.0.1:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:36:56.772477] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:36:56.779099] I [fuse-bridge.c:4977:fuse_graph_setup] 0-fuse: switched to graph 0[2014-10-13 17:36:56.779338] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0: Server lk version = 1[2014-10-13 17:36:56.779367] I [fuse-bridge.c:3914:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.17[2014-10-13 17:36:56.779438] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1: Server lk version = 1[2014-10-13 17:37:02.010942] I [fuse-bridge.c:4818:fuse_thread_proc] 0-fuse: unmounting /srv/nfs/HA-WIN-TT-1T[2014-10-13 17:37:02.011296] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fc7b7672e6d] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7fc7b7d20b50] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7fc7b95add55]))) 0-: received signum (15), shutting down[2014-10-13 17:37:02.011316] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/srv/nfs/HA-WIN-TT-1T'.[2014-10-13 17:37:31.133036] W [socket.c:522:__socket_rwv] 0-HA-WIN-TT-1T-client-0: readv on 10.250.0.1:49160 failed (No data available)[2014-10-13 17:37:31.133110] I [client.c:2229:client_rpc_notify] 0-HA-WIN-TT-1T-client-0: disconnected from 10.250.0.1:49160. Client process will keep trying to connect to glusterd until brick's port is available[2014-10-13 17:37:33.317437] W [socket.c:522:__socket_rwv] 0-HA-WIN-TT-1T-client-1: readv on 10.250.0.2:49160 failed (No data available)[2014-10-13 17:37:33.317478] I [client.c:2229:client_rpc_notify] 0-HA-WIN-TT-1T-client-1: disconnected from 10.250.0.2:49160. Client process will keep trying to connect to glusterd until brick's port is available[2014-10-13 17:37:33.317496] E [afr-common.c:4168:afr_notify] 0-HA-WIN-TT-1T-replicate-0: All subvolumes are down. 
Going offline until atleast one of them comes back up.[2014-10-13 17:37:42.045604] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)[2014-10-13 17:37:42.046177] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:37:42.048863] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Connected to 10.250.0.1:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:37:42.048883] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:37:42.048897] I [client-handshake.c:1314:client_post_handshake] 0-HA-WIN-TT-1T-client-0: 1 fds open - Delaying child_up until they are re-opened[2014-10-13 17:37:42.049299] W [client-handshake.c:980:client3_3_reopen_cbk] 0-HA-WIN-TT-1T-client-0: reopen on <gfid:b00e322a-7bae-479f-91e0-1fd77c73692b> failed (Stale NFS file handle)[2014-10-13 17:37:42.049328] I [client-handshake.c:936:client_child_up_reopen_done] 0-HA-WIN-TT-1T-client-0: last fd open'd/lock-self-heal'd - notifying CHILD-UP[2014-10-13 17:37:42.049360] I [afr-common.c:4131:afr_notify] 0-HA-WIN-TT-1T-replicate-0: Subvolume 'HA-WIN-TT-1T-client-0' came back up; going online.[2014-10-13 17:37:42.049446] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0: Server lk version = 1[2014-10-13 17:37:45.087592] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-1: changing port to 49160 (from 0)[2014-10-13 17:37:45.088132] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:37:45.088343] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Connected to 10.250.0.2:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:37:45.088360] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:37:45.088373] I [client-handshake.c:1314:client_post_handshake] 0-HA-WIN-TT-1T-client-1: 1 fds open - Delaying child_up until they are re-opened[2014-10-13 17:37:45.088681] W [client-handshake.c:980:client3_3_reopen_cbk] 0-HA-WIN-TT-1T-client-1: reopen on <gfid:b00e322a-7bae-479f-91e0-1fd77c73692b> failed (Stale NFS file handle)[2014-10-13 17:37:45.088697] I [client-handshake.c:936:client_child_up_reopen_done] 0-HA-WIN-TT-1T-client-1: last fd open'd/lock-self-heal'd - notifying CHILD-UP[2014-10-13 17:37:45.088819] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1: Server lk version = 1[2014-10-13 17:37:54.601822] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs --direct-io-mode=enable --fuse-mountopts=default_permissions,allow_other,max_read=131072 --volfile-server=stor1 --volfile-server=stor2 --volfile-id=HA-WIN-TT-1T --fuse-mountopts=default_permissions,allow_other,max_read=131072 /srv/nfs/HA-WIN-TT-1T)[2014-10-13 17:37:54.604972] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:37:54.605034] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:37:54.609219] I [dht-shared.c:311:dht_init_regex] 0-HA-WIN-TT-1T-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$[2014-10-13 17:37:54.611421] I [socket.c:3561:socket_init] 
0-HA-WIN-TT-1T-client-1: SSL support is NOT enabled[2014-10-13 17:37:54.611466] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-1: using system polling thread[2014-10-13 17:37:54.611808] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-0: SSL support is NOT enabled[2014-10-13 17:37:54.611821] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-0: using system polling thread[2014-10-13 17:37:54.611862] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:37:54.612354] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-1: parent translators are ready, attempting connect on transportFinal graph:+------------------------------------------------------------------------------+1: volume HA-WIN-TT-1T-client-02: type protocol/client3: option remote-host stor14: option remote-subvolume /exports/NFS-WIN/1T5: option transport-type socket6: option ping-timeout 107: option send-gids true8: end-volume9:10: volume HA-WIN-TT-1T-client-111: type protocol/client12: option remote-host stor213: option remote-subvolume /exports/NFS-WIN/1T14: option transport-type socket15: option ping-timeout 1016: option send-gids true17: end-volume18:19: volume HA-WIN-TT-1T-replicate-020: type cluster/replicate21: subvolumes HA-WIN-TT-1T-client-0 HA-WIN-TT-1T-client-122: end-volume23:24: volume HA-WIN-TT-1T-dht25: type cluster/distribute26: subvolumes HA-WIN-TT-1T-replicate-027: end-volume28:29: volume HA-WIN-TT-1T-write-behind30: type performance/write-behind31: subvolumes HA-WIN-TT-1T-dht32: end-volume33:34: volume HA-WIN-TT-1T-read-ahead35: type performance/read-ahead36: subvolumes HA-WIN-TT-1T-write-behind37: end-volume38:39: volume HA-WIN-TT-1T-io-cache40: type performance/io-cache41: subvolumes HA-WIN-TT-1T-read-ahead42: end-volume43:44: volume HA-WIN-TT-1T-quick-read45: type performance/quick-read46: subvolumes HA-WIN-TT-1T-io-cache47: end-volume48:49: volume HA-WIN-TT-1T-open-behind50: type performance/open-behind51: subvolumes HA-WIN-TT-1T-quick-read52: end-volume53:54: volume HA-WIN-TT-1T-md-cache55: type performance/md-cache56: subvolumes HA-WIN-TT-1T-open-behind57: end-volume58:59: volume HA-WIN-TT-1T60: type debug/io-stats61: option latency-measurement off62: option count-fop-hits off63: subvolumes HA-WIN-TT-1T-md-cache64: end-volume65:+------------------------------------------------------------------------------+[2014-10-13 17:37:54.613137] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)[2014-10-13 17:37:54.613521] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-1: changing port to 49160 (from 0)[2014-10-13 17:37:54.614228] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:37:54.614399] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:37:54.614483] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Connected to 10.250.0.1:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:37:54.614499] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:37:54.614557] I [afr-common.c:4131:afr_notify] 0-HA-WIN-TT-1T-replicate-0: Subvolume 'HA-WIN-TT-1T-client-0' came back up; going online.[2014-10-13 17:37:54.614625] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0: Server lk version = 1[2014-10-13 17:37:54.614709] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Connected to 10.250.0.2:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:37:54.614724] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:37:54.621318] I [fuse-bridge.c:4977:fuse_graph_setup] 0-fuse: switched to graph 0[2014-10-13 17:37:54.621545] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1: Server lk version = 1[2014-10-13 17:37:54.621617] I [fuse-bridge.c:3914:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.17[2014-10-13 17:38:25.951778] W [client-rpc-fops.c:4235:client3_3_flush] 0-HA-WIN-TT-1T-client-0: (b00e322a-7bae-479f-91e0-1fd77c73692b) remote_fd is -1. EBADFD[2014-10-13 17:38:25.951827] W [client-rpc-fops.c:4235:client3_3_flush] 0-HA-WIN-TT-1T-client-1: (b00e322a-7bae-479f-91e0-1fd77c73692b) remote_fd is -1. EBADFD[2014-10-13 17:38:25.966963] I [fuse-bridge.c:4818:fuse_thread_proc] 0-fuse: unmounting /srv/nfs/HA-WIN-TT-1T[2014-10-13 17:38:25.967174] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ffec893de6d] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7ffec8febb50] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7ffeca878d55]))) 0-: received signum (15), shutting down[2014-10-13 17:38:25.967194] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/srv/nfs/HA-WIN-TT-1T'.[2014-10-13 17:40:21.500514] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:40:21.517782] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:40:21.524056] I [dht-shared.c:311:dht_init_regex] 0-HA-WIN-TT-1T-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$[2014-10-13 17:40:21.528430] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
glustershd stor1
2014-10-13 17:38:17.203360] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/75bbc77a676bde0d0afe20f40dc9e3e1.socket --xlator-option *replicate*.node-uuid=e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3)[2014-10-13 17:38:17.204958] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled[2014-10-13 17:38:17.205016] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread[2014-10-13 17:38:17.205188] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:38:17.205209] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:38:17.207840] I [graph.c:254:gf_add_cmdline_options] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: adding option 'node-uuid' for volume 'HA-2TB-TT-Proxmox-cluster-replicate-0' with value 'e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3'[2014-10-13 17:38:17.209433] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: SSL support is NOT enabled[2014-10-13 17:38:17.209448] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: using system polling thread[2014-10-13 17:38:17.209625] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: SSL support is NOT enabled[2014-10-13 17:38:17.209634] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: using system polling thread[2014-10-13 17:38:17.209652] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:38:17.210241] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-1: parent translators are ready, attempting connect on transportFinal graph:+------------------------------------------------------------------------------+1: volume HA-2TB-TT-Proxmox-cluster-client-02: type protocol/client3: option remote-host stor14: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB5: option transport-type socket6: option username 59c66122-55c1-4c28-956e-6189fcb1aff57: option password 34b79afb-a93c-431b-900a-b688e67cdbc98: option ping-timeout 109: end-volume10:11: volume HA-2TB-TT-Proxmox-cluster-client-112: type protocol/client13: option remote-host stor214: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB15: option transport-type socket16: option username 59c66122-55c1-4c28-956e-6189fcb1aff517: option password 34b79afb-a93c-431b-900a-b688e67cdbc918: option ping-timeout 1019: end-volume20:21: volume HA-2TB-TT-Proxmox-cluster-replicate-022: type cluster/replicate23: option node-uuid e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b324: option background-self-heal-count 025: option metadata-self-heal on26: option data-self-heal on27: option entry-self-heal on28: option self-heal-daemon on29: option iam-self-heal-daemon yes30: subvolumes HA-2TB-TT-Proxmox-cluster-client-0 HA-2TB-TT-Proxmox-cluster-client-131: end-volume32:33: volume glustershd34: type debug/io-stats35: subvolumes HA-2TB-TT-Proxmox-cluster-replicate-036: end-volume37:+------------------------------------------------------------------------------+[2014-10-13 17:38:17.210709] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-0: changing port to 49159 (from 0)[2014-10-13 17:38:17.211008] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-0: Using Program GlusterFS 3.3, Num 
(1298437), Version (330)[2014-10-13 17:38:17.211170] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Connected to 10.250.0.1:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:17.211195] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:17.211250] I [afr-common.c:4131:afr_notify] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Subvolume 'HA-2TB-TT-Proxmox-cluster-client-0' came back up; going online.[2014-10-13 17:38:17.211297] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server lk version = 1[2014-10-13 17:38:17.211656] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Another crawl is in progress for HA-2TB-TT-Proxmox-cluster-client-0[2014-10-13 17:38:17.211661] E [afr-self-heald.c:1479:afr_find_child_position] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: getxattr failed on HA-2TB-TT-Proxmox-cluster-client-1 - (Transport endpoint is not connected)[2014-10-13 17:38:17.216327] E [afr-self-heal-data.c:1611:afr_sh_data_open_cbk] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: open of <gfid:65381af4-8e0b-4721-8214-71d29dcf5237> failed on child HA-2TB-TT-Proxmox-cluster-client-1 (Transport endpoint is not connected)[2014-10-13 17:38:17.217372] E [afr-self-heal-data.c:1611:afr_sh_data_open_cbk] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: open of <gfid:65381af4-8e0b-4721-8214-71d29dcf5237> failed on child HA-2TB-TT-Proxmox-cluster-client-1 (Transport endpoint is not connected)[2014-10-13 17:38:19.226057] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-1: changing port to 49159 (from 0)[2014-10-13 17:38:19.226704] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:19.226896] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Connected to 10.250.0.2:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:19.226916] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:19.227031] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server lk version = 1[2014-10-13 17:38:25.933950] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f1a7c03ce6d] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f1a7c6eab50] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f1a7df77d55]))) 0-: received signum (15), shutting down[2014-10-13 17:38:26.942918] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/75bbc77a676bde0d0afe20f40dc9e3e1.socket --xlator-option *replicate*.node-uuid=e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3)[2014-10-13 17:38:26.944548] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled[2014-10-13 17:38:26.944584] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread[2014-10-13 17:38:26.944689] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT 
enabled[2014-10-13 17:38:26.944701] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:38:26.946667] I [graph.c:254:gf_add_cmdline_options] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: adding option 'node-uuid' for volume 'HA-2TB-TT-Proxmox-cluster-replicate-0' with value 'e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3'[2014-10-13 17:38:26.946684] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-replicate-0: adding option 'node-uuid' for volume 'HA-WIN-TT-1T-replicate-0' with value 'e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3'[2014-10-13 17:38:26.948783] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: SSL support is NOT enabled[2014-10-13 17:38:26.948809] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: using system polling thread[2014-10-13 17:38:26.949118] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: SSL support is NOT enabled[2014-10-13 17:38:26.949134] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: using system polling thread[2014-10-13 17:38:26.951698] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-1: SSL support is NOT enabled[2014-10-13 17:38:26.951715] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-1: using system polling thread[2014-10-13 17:38:26.951921] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-0: SSL support is NOT enabled[2014-10-13 17:38:26.951932] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-0: using system polling thread[2014-10-13 17:38:26.951959] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:38:26.952612] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-1: parent translators are ready, attempting connect on transport[2014-10-13 17:38:26.952862] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:38:26.953447] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-1: parent translators are ready, attempting connect on transportFinal graph:+------------------------------------------------------------------------------+1: volume HA-2TB-TT-Proxmox-cluster-client-02: type protocol/client3: option remote-host stor14: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB5: option transport-type socket6: option username 59c66122-55c1-4c28-956e-6189fcb1aff57: option password 34b79afb-a93c-431b-900a-b688e67cdbc98: option ping-timeout 109: end-volume10:11: volume HA-2TB-TT-Proxmox-cluster-client-112: type protocol/client13: option remote-host stor214: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB15: option transport-type socket16: option username 59c66122-55c1-4c28-956e-6189fcb1aff517: option password 34b79afb-a93c-431b-900a-b688e67cdbc918: option ping-timeout 1019: end-volume20:21: volume HA-2TB-TT-Proxmox-cluster-replicate-022: type cluster/replicate23: option node-uuid e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b324: option background-self-heal-count 025: option metadata-self-heal on26: option data-self-heal on27: option entry-self-heal on28: option self-heal-daemon on29: option iam-self-heal-daemon yes30: subvolumes HA-2TB-TT-Proxmox-cluster-client-0 HA-2TB-TT-Proxmox-cluster-client-131: end-volume32:33: volume HA-WIN-TT-1T-client-034: type protocol/client35: option remote-host stor136: option remote-subvolume /exports/NFS-WIN/1T37: option transport-type socket38: option username 101b907c-ff21-47da-8ba6-37e2920691ce39: option password 
f4f29094-891f-4241-8736-5e3302ed8bc840: option ping-timeout 1041: end-volume42:43: volume HA-WIN-TT-1T-client-144: type protocol/client45: option remote-host stor246: option remote-subvolume /exports/NFS-WIN/1T47: option transport-type socket48: option username 101b907c-ff21-47da-8ba6-37e2920691ce49: option password f4f29094-891f-4241-8736-5e3302ed8bc850: option ping-timeout 1051: end-volume52:53: volume HA-WIN-TT-1T-replicate-054: type cluster/replicate55: option node-uuid e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b356: option background-self-heal-count 057: option metadata-self-heal on58: option data-self-heal on59: option entry-self-heal on60: option self-heal-daemon on61: option iam-self-heal-daemon yes62: subvolumes HA-WIN-TT-1T-client-0 HA-WIN-TT-1T-client-163: end-volume64:65: volume glustershd66: type debug/io-stats67: subvolumes HA-2TB-TT-Proxmox-cluster-replicate-0 HA-WIN-TT-1T-replicate-068: end-volume69:+------------------------------------------------------------------------------+[2014-10-13 17:38:26.954036] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-0: changing port to 49159 (from 0)[2014-10-13 17:38:26.954308] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)[2014-10-13 17:38:26.954741] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:26.954815] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:26.954999] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Connected to 10.250.0.1:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:26.955017] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:26.955073] I [afr-common.c:4131:afr_notify] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Subvolume 'HA-2TB-TT-Proxmox-cluster-client-0' came back up; going online.[2014-10-13 17:38:26.955127] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server lk version = 1[2014-10-13 17:38:26.955151] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Connected to 10.250.0.1:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:38:26.955161] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:26.955226] I [afr-common.c:4131:afr_notify] 0-HA-WIN-TT-1T-replicate-0: Subvolume 'HA-WIN-TT-1T-client-0' came back up; going online.[2014-10-13 17:38:26.955297] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0: Server lk version = 1[2014-10-13 17:38:26.955583] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Another crawl is in progress for HA-2TB-TT-Proxmox-cluster-client-0[2014-10-13 17:38:26.955589] E [afr-self-heald.c:1479:afr_find_child_position] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: getxattr failed on HA-2TB-TT-Proxmox-cluster-client-1 - (Transport endpoint is not connected)[2014-10-13 17:38:26.955832] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0: Another crawl is in progress for HA-WIN-TT-1T-client-0[2014-10-13 17:38:26.955858] 
E [afr-self-heald.c:1479:afr_find_child_position] 0-HA-WIN-TT-1T-replicate-0: getxattr failed on HA-WIN-TT-1T-client-1 - (Transport endpoint is not connected)[2014-10-13 17:38:26.964913] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-1: changing port to 49159 (from 0)[2014-10-13 17:38:26.965553] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:26.965794] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Connected to 10.250.0.2:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:26.965815] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:26.965968] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server lk version = 1[2014-10-13 17:38:26.967510] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Another crawl is in progress for HA-2TB-TT-Proxmox-cluster-client-0[2014-10-13 17:38:27.971374] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-1: changing port to 49160 (from 0)[2014-10-13 17:38:27.971940] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:27.975460] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Connected to 10.250.0.2:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:38:27.975481] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:27.976656] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1: Server lk version = 1[2014-10-13 17:41:05.390992] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.408292] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.412221] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing[2014-10-13 17:41:05.417388] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuingroot@stor1:~#
glustershd stor2
[2014-10-13 17:38:28.992891] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/b1494ca4d047df6e8590d7080131908f.socket --xlator-option *replicate*.node-uuid=abf9e3a7-eb91-4273-acdf-876cd6ba1fe3)[2014-10-13 17:38:28.994439] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled[2014-10-13 17:38:28.994476] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread[2014-10-13 17:38:28.994581] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:38:28.994594] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:38:28.996569] I [graph.c:254:gf_add_cmdline_options] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: adding option 'node-uuid' for volume 'HA-2TB-TT-Proxmox-cluster-replicate-0' with value 'abf9e3a7-eb91-4273-acdf-876cd6ba1fe3'[2014-10-13 17:38:28.996585] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-replicate-0: adding option 'node-uuid' for volume 'HA-WIN-TT-1T-replicate-0' with value 'abf9e3a7-eb91-4273-acdf-876cd6ba1fe3'[2014-10-13 17:38:28.998463] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: SSL support is NOT enabled[2014-10-13 17:38:28.998483] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-1: using system polling thread[2014-10-13 17:38:28.998695] I [socket.c:3561:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: SSL support is NOT enabled[2014-10-13 17:38:28.998707] I [socket.c:3576:socket_init] 0-HA-2TB-TT-Proxmox-cluster-client-0: using system polling thread[2014-10-13 17:38:29.000506] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-1: SSL support is NOT enabled[2014-10-13 17:38:29.000520] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-1: using system polling thread[2014-10-13 17:38:29.000723] I [socket.c:3561:socket_init] 0-HA-WIN-TT-1T-client-0: SSL support is NOT enabled[2014-10-13 17:38:29.000734] I [socket.c:3576:socket_init] 0-HA-WIN-TT-1T-client-0: using system polling thread[2014-10-13 17:38:29.000762] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:38:29.001064] I [client.c:2294:notify] 0-HA-2TB-TT-Proxmox-cluster-client-1: parent translators are ready, attempting connect on transport[2014-10-13 17:38:29.001639] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-0: parent translators are ready, attempting connect on transport[2014-10-13 17:38:29.001877] I [client.c:2294:notify] 0-HA-WIN-TT-1T-client-1: parent translators are ready, attempting connect on transportFinal graph:+------------------------------------------------------------------------------+1: volume HA-2TB-TT-Proxmox-cluster-client-02: type protocol/client3: option remote-host stor14: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB5: option transport-type socket6: option username 59c66122-55c1-4c28-956e-6189fcb1aff57: option password 34b79afb-a93c-431b-900a-b688e67cdbc98: option ping-timeout 109: end-volume10:11: volume HA-2TB-TT-Proxmox-cluster-client-112: type protocol/client13: option remote-host stor214: option remote-subvolume /exports/HA-2TB-TT-Proxmox-cluster/2TB15: option transport-type socket16: option username 59c66122-55c1-4c28-956e-6189fcb1aff517: option password 34b79afb-a93c-431b-900a-b688e67cdbc918: option ping-timeout 
1019: end-volume20:21: volume HA-2TB-TT-Proxmox-cluster-replicate-022: type cluster/replicate23: option node-uuid abf9e3a7-eb91-4273-acdf-876cd6ba1fe324: option background-self-heal-count 025: option metadata-self-heal on26: option data-self-heal on27: option entry-self-heal on28: option self-heal-daemon on29: option iam-self-heal-daemon yes30: subvolumes HA-2TB-TT-Proxmox-cluster-client-0 HA-2TB-TT-Proxmox-cluster-client-131: end-volume32:33: volume HA-WIN-TT-1T-client-034: type protocol/client35: option remote-host stor136: option remote-subvolume /exports/NFS-WIN/1T37: option transport-type socket38: option username 101b907c-ff21-47da-8ba6-37e2920691ce39: option password f4f29094-891f-4241-8736-5e3302ed8bc840: option ping-timeout 1041: end-volume42:43: volume HA-WIN-TT-1T-client-144: type protocol/client45: option remote-host stor246: option remote-subvolume /exports/NFS-WIN/1T47: option transport-type socket48: option username 101b907c-ff21-47da-8ba6-37e2920691ce49: option password f4f29094-891f-4241-8736-5e3302ed8bc850: option ping-timeout 1051: end-volume52:53: volume HA-WIN-TT-1T-replicate-054: type cluster/replicate55: option node-uuid abf9e3a7-eb91-4273-acdf-876cd6ba1fe356: option background-self-heal-count 057: option metadata-self-heal on58: option data-self-heal on59: option entry-self-heal on60: option self-heal-daemon on61: option iam-self-heal-daemon yes62: subvolumes HA-WIN-TT-1T-client-0 HA-WIN-TT-1T-client-163: end-volume64:65: volume glustershd66: type debug/io-stats67: subvolumes HA-2TB-TT-Proxmox-cluster-replicate-0 HA-WIN-TT-1T-replicate-068: end-volume69:+------------------------------------------------------------------------------+[2014-10-13 17:38:29.002743] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-1: changing port to 49159 (from 0)[2014-10-13 17:38:29.003027] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-1: changing port to 49160 (from 0)[2014-10-13 17:38:29.003290] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-2TB-TT-Proxmox-cluster-client-0: changing port to 49159 (from 0)[2014-10-13 17:38:29.003334] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-HA-WIN-TT-1T-client-0: changing port to 49160 (from 0)[2014-10-13 17:38:29.003922] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:29.004023] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:29.004139] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-2TB-TT-Proxmox-cluster-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:29.004202] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Connected to 10.250.0.2:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:29.004217] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:29.004266] I [afr-common.c:4131:afr_notify] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Subvolume 'HA-2TB-TT-Proxmox-cluster-client-1' came back up; going online.[2014-10-13 17:38:29.004318] I [client-handshake.c:1677:select_server_supported_programs] 0-HA-WIN-TT-1T-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)[2014-10-13 17:38:29.004368] I 
[client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Connected to 10.250.0.2:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:38:29.004383] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-1: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:29.004429] I [afr-common.c:4131:afr_notify] 0-HA-WIN-TT-1T-replicate-0: Subvolume 'HA-WIN-TT-1T-client-1' came back up; going online.[2014-10-13 17:38:29.004483] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-1: Server lk version = 1[2014-10-13 17:38:29.004506] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-1: Server lk version = 1[2014-10-13 17:38:29.004526] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Connected to 10.250.0.1:49159, attached to remote volume '/exports/HA-2TB-TT-Proxmox-cluster/2TB'.[2014-10-13 17:38:29.004535] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:29.004613] I [client-handshake.c:1462:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Connected to 10.250.0.1:49160, attached to remote volume '/exports/NFS-WIN/1T'.[2014-10-13 17:38:29.004626] I [client-handshake.c:1474:client_setvolume_cbk] 0-HA-WIN-TT-1T-client-0: Server and Client lk-version numbers are not same, reopening the fds[2014-10-13 17:38:29.004731] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: Server lk version = 1[2014-10-13 17:38:29.004796] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-WIN-TT-1T-client-0: Server lk version = 1[2014-10-13 17:38:29.005291] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-WIN-TT-1T-replicate-0: Another crawl is in progress for HA-WIN-TT-1T-client-1[2014-10-13 17:38:29.005303] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Another crawl is in progress for HA-2TB-TT-Proxmox-cluster-client-1[2014-10-13 17:38:29.005443] I [afr-self-heald.c:1690:afr_dir_exclusive_crawl] 0-HA-2TB-TT-Proxmox-cluster-replicate-0: Another crawl is in progress for HA-2TB-TT-Proxmox-cluster-client-1[2014-10-13 17:41:05.427867] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.443271] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.444111] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing[2014-10-13 17:41:05.444807] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
brick stor2
[2014-10-13 17:38:17.213386] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(+0x462a0) [0x7f343271f2a0] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(synctask_wrap+0x12) [0x7f343371db12] (-->/usr/sbin/glusterfsd(glusterfs_handle_terminate+0x15) [0x7f3434790dd5]))) 0-: received signum (15), shutting down[2014-10-13 17:38:26.957312] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.5.2 (/usr/sbin/glusterfsd -s stor2 --volfile-id HA-WIN-TT-1T.stor2.exports-NFS-WIN-1T -p /var/lib/glusterd/vols/HA-WIN-TT-1T/run/stor2-exports-NFS-WIN-1T.pid -S /var/run/91514691033d00e666bb151f9c771a26.socket --brick-name /exports/NFS-WIN/1T -l /var/log/glusterfs/bricks/exports-NFS-WIN-1T.log --xlator-option *-posix.glusterd-uuid=abf9e3a7-eb91-4273-acdf-876cd6ba1fe3 --brick-port 49160 --xlator-option HA-WIN-TT-1T-server.listen-port=49160)[2014-10-13 17:38:26.958864] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled[2014-10-13 17:38:26.958899] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread[2014-10-13 17:38:26.959003] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:38:26.959015] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:38:26.961860] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-server: adding option 'listen-port' for volume 'HA-WIN-TT-1T-server' with value '49160'[2014-10-13 17:38:26.961878] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-posix: adding option 'glusterd-uuid' for volume 'HA-WIN-TT-1T-posix' with value 'abf9e3a7-eb91-4273-acdf-876cd6ba1fe3'[2014-10-13 17:38:26.965032] I [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64[2014-10-13 17:38:26.965075] W [options.c:888:xl_opt_validate] 0-HA-WIN-TT-1T-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction[2014-10-13 17:38:26.965097] I [socket.c:3561:socket_init] 0-tcp.HA-WIN-TT-1T-server: SSL support is NOT enabled[2014-10-13 17:38:26.965105] I [socket.c:3576:socket_init] 0-tcp.HA-WIN-TT-1T-server: using system polling thread[2014-10-13 17:38:26.965602] W [graph.c:329:_log_if_unknown_option] 0-HA-WIN-TT-1T-quota: option 'timeout' is not recognizedFinal graph:+------------------------------------------------------------------------------+1: volume HA-WIN-TT-1T-posix2: type storage/posix3: option glusterd-uuid abf9e3a7-eb91-4273-acdf-876cd6ba1fe34: option directory /exports/NFS-WIN/1T5: option volume-id 2937ac01-4cba-44a8-8ff8-0161b67f8ee46: end-volume7:8: volume HA-WIN-TT-1T-changelog9: type features/changelog10: option changelog-brick /exports/NFS-WIN/1T11: option changelog-dir /exports/NFS-WIN/1T/.glusterfs/changelogs12: subvolumes HA-WIN-TT-1T-posix13: end-volume14:15: volume HA-WIN-TT-1T-access-control16: type features/access-control17: subvolumes HA-WIN-TT-1T-changelog18: end-volume19:20: volume HA-WIN-TT-1T-locks21: type features/locks22: subvolumes HA-WIN-TT-1T-access-control23: end-volume24:25: volume HA-WIN-TT-1T-io-threads26: type performance/io-threads27: subvolumes HA-WIN-TT-1T-locks28: end-volume29:30: volume HA-WIN-TT-1T-index31: type features/index32: option index-base /exports/NFS-WIN/1T/.glusterfs/indices33: subvolumes HA-WIN-TT-1T-io-threads34: end-volume35:36: volume HA-WIN-TT-1T-marker37: type features/marker38: option volume-uuid 2937ac01-4cba-44a8-8ff8-0161b67f8ee439: option 
timestamp-file /var/lib/glusterd/vols/HA-WIN-TT-1T/marker.tstamp40: option xtime off41: option gsync-force-xtime off42: option quota off43: subvolumes HA-WIN-TT-1T-index44: end-volume45:46: volume HA-WIN-TT-1T-quota47: type features/quota48: option volume-uuid HA-WIN-TT-1T49: option server-quota off50: option timeout 051: option deem-statfs off52: subvolumes HA-WIN-TT-1T-marker53: end-volume54:55: volume /exports/NFS-WIN/1T56: type debug/io-stats57: option latency-measurement off58: option count-fop-hits off59: subvolumes HA-WIN-TT-1T-quota60: end-volume61:62: volume HA-WIN-TT-1T-server63: type protocol/server64: option transport.socket.listen-port 4916065: option rpc-auth.auth-glusterfs on66: option rpc-auth.auth-unix on67: option rpc-auth.auth-null on68: option transport-type tcp69: option auth.login./exports/NFS-WIN/1T.allow 101b907c-ff21-47da-8ba6-37e2920691ce70: option auth.login.101b907c-ff21-47da-8ba6-37e2920691ce.password f4f29094-891f-4241-8736-5e3302ed8bc871: option auth.addr./exports/NFS-WIN/1T.allow *72: subvolumes /exports/NFS-WIN/1T73: end-volume74:+------------------------------------------------------------------------------+[2014-10-13 17:38:27.985048] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from stor1-14362-2014/10/13-17:38:26:938194-HA-WIN-TT-1T-client-1-0-0 (version: 3.5.2)[2014-10-13 17:38:28.988700] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-1-0-1 (version: 3.5.2)[2014-10-13 17:38:29.004121] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from stor2-15494-2014/10/13-17:38:28:989227-HA-WIN-TT-1T-client-1-0-0 (version: 3.5.2)[2014-10-13 17:38:38.515315] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from glstor-cli-23823-2014/10/13-17:37:54:595571-HA-WIN-TT-1T-client-1-0-0 (version: 3.5.2)[2014-10-13 17:39:09.872223] I [server.c:520:server_rpc_notify] 0-HA-WIN-TT-1T-server: disconnecting connectionfrom glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-1-0-1[2014-10-13 17:39:09.872299] I [client_t.c:417:gf_client_unref] 0-HA-WIN-TT-1T-server: Shutting down connection glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-1-0-1[2014-10-13 17:41:05.427810] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.443234] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.445049] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuingroot@stor2:~#
brick stor1
[2014-10-13 17:38:24.900066] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.5.2 (/usr/sbin/glusterfsd -s stor1 --volfile-id HA-WIN-TT-1T.stor1.exports-NFS-WIN-1T -p /var/lib/glusterd/vols/HA-WIN-TT-1T/run/stor1-exports-NFS-WIN-1T.pid -S /var/run/02580c93278849804f3f34f7ed8314b2.socket --brick-name /exports/NFS-WIN/1T -l /var/log/glusterfs/bricks/exports-NFS-WIN-1T.log --xlator-option *-posix.glusterd-uuid=e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3 --brick-port 49160 --xlator-option HA-WIN-TT-1T-server.listen-port=49160)[2014-10-13 17:38:24.902022] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled[2014-10-13 17:38:24.902077] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread[2014-10-13 17:38:24.902214] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled[2014-10-13 17:38:24.902239] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread[2014-10-13 17:38:24.906698] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-server: adding option 'listen-port' for volume 'HA-WIN-TT-1T-server' with value '49160'[2014-10-13 17:38:24.906731] I [graph.c:254:gf_add_cmdline_options] 0-HA-WIN-TT-1T-posix: adding option 'glusterd-uuid' for volume 'HA-WIN-TT-1T-posix' with value 'e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b3'[2014-10-13 17:38:24.908378] I [rpcsvc.c:2127:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64[2014-10-13 17:38:24.908435] W [options.c:888:xl_opt_validate] 0-HA-WIN-TT-1T-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction[2014-10-13 17:38:24.908472] I [socket.c:3561:socket_init] 0-tcp.HA-WIN-TT-1T-server: SSL support is NOT enabled[2014-10-13 17:38:24.908485] I [socket.c:3576:socket_init] 0-tcp.HA-WIN-TT-1T-server: using system polling thread[2014-10-13 17:38:24.909105] W [graph.c:329:_log_if_unknown_option] 0-HA-WIN-TT-1T-quota: option 'timeout' is not recognizedFinal graph:+------------------------------------------------------------------------------+1: volume HA-WIN-TT-1T-posix2: type storage/posix3: option glusterd-uuid e09cbbc2-08a3-4e5b-83b8-48eb11a1c7b34: option directory /exports/NFS-WIN/1T5: option volume-id 2937ac01-4cba-44a8-8ff8-0161b67f8ee46: end-volume7:8: volume HA-WIN-TT-1T-changelog9: type features/changelog10: option changelog-brick /exports/NFS-WIN/1T11: option changelog-dir /exports/NFS-WIN/1T/.glusterfs/changelogs12: subvolumes HA-WIN-TT-1T-posix13: end-volume14:15: volume HA-WIN-TT-1T-access-control16: type features/access-control17: subvolumes HA-WIN-TT-1T-changelog18: end-volume19:20: volume HA-WIN-TT-1T-locks21: type features/locks22: subvolumes HA-WIN-TT-1T-access-control23: end-volume24:25: volume HA-WIN-TT-1T-io-threads26: type performance/io-threads27: subvolumes HA-WIN-TT-1T-locks28: end-volume29:30: volume HA-WIN-TT-1T-index31: type features/index32: option index-base /exports/NFS-WIN/1T/.glusterfs/indices33: subvolumes HA-WIN-TT-1T-io-threads34: end-volume35:36: volume HA-WIN-TT-1T-marker37: type features/marker38: option volume-uuid 2937ac01-4cba-44a8-8ff8-0161b67f8ee439: option timestamp-file /var/lib/glusterd/vols/HA-WIN-TT-1T/marker.tstamp40: option xtime off41: option gsync-force-xtime off42: option quota off43: subvolumes HA-WIN-TT-1T-index44: end-volume45:46: volume HA-WIN-TT-1T-quota47: type features/quota48: option volume-uuid HA-WIN-TT-1T49: option server-quota off50: option timeout 051: option deem-statfs 
off52: subvolumes HA-WIN-TT-1T-marker53: end-volume54:55: volume /exports/NFS-WIN/1T56: type debug/io-stats57: option latency-measurement off58: option count-fop-hits off59: subvolumes HA-WIN-TT-1T-quota60: end-volume61:62: volume HA-WIN-TT-1T-server63: type protocol/server64: option transport.socket.listen-port 4916065: option rpc-auth.auth-glusterfs on66: option rpc-auth.auth-unix on67: option rpc-auth.auth-null on68: option transport-type tcp69: option auth.login./exports/NFS-WIN/1T.allow 101b907c-ff21-47da-8ba6-37e2920691ce70: option auth.login.101b907c-ff21-47da-8ba6-37e2920691ce.password f4f29094-891f-4241-8736-5e3302ed8bc871: option auth.addr./exports/NFS-WIN/1T.allow *72: subvolumes /exports/NFS-WIN/1T73: end-volume74:+------------------------------------------------------------------------------+[2014-10-13 17:38:25.933796] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-0-0-1 (version: 3.5.2)[2014-10-13 17:38:26.954924] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from stor1-14362-2014/10/13-17:38:26:938194-HA-WIN-TT-1T-client-0-0-0 (version: 3.5.2)[2014-10-13 17:38:28.991488] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from stor2-15494-2014/10/13-17:38:28:989227-HA-WIN-TT-1T-client-0-0-0 (version: 3.5.2)[2014-10-13 17:38:38.502056] I [server-handshake.c:575:server_setvolume] 0-HA-WIN-TT-1T-server: accepted client from glstor-cli-23823-2014/10/13-17:37:54:595571-HA-WIN-TT-1T-client-0-0-0 (version: 3.5.2)[2014-10-13 17:39:09.858784] I [server.c:520:server_rpc_notify] 0-HA-WIN-TT-1T-server: disconnecting connectionfrom glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-0-0-1[2014-10-13 17:39:09.858863] I [client_t.c:417:gf_client_unref] 0-HA-WIN-TT-1T-server: Shutting down connection glstor-cli-20753-2014/10/13-11:50:40:959211-HA-WIN-TT-1T-client-0-0-1[2014-10-13 17:41:05.390918] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.408236] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed[2014-10-13 17:41:05.414813] I [glusterfsd-mgmt.c:1307:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
Seems to be the right part of the logs :)
2014-10-15 18:24 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
On 10/14/2014 01:20 AM, Roman wrote:
This warning says 'Read IO wait', yet not a single READ operation came to gluster. Wondering why that is :-/ Any clue? There is at least one write which took 3 seconds according to the stats. At least one synchronization operation (FINODELK) took 23 seconds. Could you give the logs of this run, for mount, glustershd and bricks?

OK, done. This time there were no disconnects, at least all of the VMs are working, but I got some mails from a VM about IO writes again:
WARNINGs: Read IO Wait time is 1.45 (outside range [0:1]).
Pranith
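(For what it's worth, a minimal way to collect the logs being asked for, assuming the default Debian locations that appear in the pastes above; run on stor1, stor2 and the client doing the FUSE mount:)

    # everything gluster writes (mount, glustershd and brick logs) lives under /var/log/glusterfs
    tar czf gluster-logs-$(hostname)-$(date +%F).tar.gz /var/log/glusterfs
    # of particular interest here:
    #   /var/log/glusterfs/glustershd.log                  (self-heal daemon)
    #   /var/log/glusterfs/bricks/exports-NFS-WIN-1T.log   (brick)
    #   the client log named after the mount point, under /var/log/glusterfs/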
Here is the output:
root@stor1:~# gluster volume profile HA-WIN-TT-1T info
Brick: stor1:/exports/NFS-WIN/1T
--------------------------------
Cumulative Stats:
   Block Size:   131072b+   262144b+
 No. of Reads:          0          0
No. of Writes:    7372798          1
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us             25   RELEASE
      0.00       0.00 us       0.00 us       0.00 us             16   RELEASEDIR
      0.00      64.00 us      52.00 us      76.00 us              2   ENTRYLK
      0.00      73.50 us      51.00 us      96.00 us              2   FLUSH
      0.00      68.43 us      30.00 us     135.00 us              7   STATFS
      0.00      54.31 us      44.00 us     109.00 us             16   OPENDIR
      0.00      50.75 us      16.00 us      74.00 us             24   FSTAT
      0.00      47.77 us      19.00 us     119.00 us             26   GETXATTR
      0.00      59.21 us      21.00 us      89.00 us             24   OPEN
      0.00      59.39 us      22.00 us     296.00 us             28   READDIR
      0.00    4972.00 us    4972.00 us    4972.00 us              1   CREATE
      0.00      97.42 us      19.00 us     184.00 us             62   LOOKUP
      0.00      89.49 us      20.00 us     656.00 us            324   FXATTROP
      3.91 1255944.81 us     127.00 us 23397532.00 us            189   FSYNC
      7.40 3406275.50 us      17.00 us 23398013.00 us            132   INODELK
     34.96   94598.02 us       8.00 us 23398705.00 us          22445   FINODELK
     53.73     442.66 us      79.00 us  3116494.00 us        7372799   WRITE

    Duration: 7813 seconds
   Data Read: 0 bytes
Data Written: 966367641600 bytes
Interval 0 Stats:Block Size: 131072b+ 262144b+No. of Reads: 0 0No. of Writes: 7372798 1%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop--------- ----------- ----------- ----------- ------------ ----0.00 0.00 us 0.00 us 0.00 us 25 RELEASE0.00 0.00 us 0.00 us 0.00 us 16 RELEASEDIR0.00 64.00 us 52.00 us 76.00 us 2 ENTRYLK0.00 73.50 us 51.00 us 96.00 us 2 FLUSH0.00 68.43 us 30.00 us 135.00 us 7 STATFS0.00 54.31 us 44.00 us 109.00 us 16 OPENDIR0.00 50.75 us 16.00 us 74.00 us 24 FSTAT0.00 47.77 us 19.00 us 119.00 us 26 GETXATTR0.00 59.21 us 21.00 us 89.00 us 24 OPEN0.00 59.39 us 22.00 us 296.00 us 28 READDIR0.00 4972.00 us 4972.00 us 4972.00 us 1 CREATE0.00 97.42 us 19.00 us 184.00 us 62 LOOKUP0.00 89.49 us 20.00 us 656.00 us 324 FXATTROP3.91 1255944.81 us 127.00 us 23397532.00 us 189 FSYNC7.40 3406275.50 us 17.00 us 23398013.00 us 132 INODELK34.96 94598.02 us 8.00 us 23398705.00 us 22445 FINODELK53.73 442.66 us 79.00 us 3116494.00 us 7372799 WRITE
Duration: 7813 seconds
Data Read: 0 bytes
Data Written: 966367641600 bytes
Brick: stor2:/exports/NFS-WIN/1T
--------------------------------
Cumulative Stats:
   Block Size:              131072b+              262144b+
 No. of Reads:                     0                     0
No. of Writes:               7372798                     1

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR
      0.00      61.50 us      46.00 us      77.00 us              2     ENTRYLK
      0.00      82.00 us      67.00 us      97.00 us              2       FLUSH
      0.00     265.00 us     265.00 us     265.00 us              1      CREATE
      0.00      57.43 us      30.00 us      85.00 us              7      STATFS
      0.00      61.12 us      37.00 us     107.00 us             16     OPENDIR
      0.00      44.04 us      12.00 us      86.00 us             24       FSTAT
      0.00      41.42 us      24.00 us      96.00 us             26    GETXATTR
      0.00      45.93 us      24.00 us     133.00 us             28     READDIR
      0.00      57.17 us      25.00 us     147.00 us             24        OPEN
      0.00     145.28 us      31.00 us     288.00 us             32    READDIRP
      0.00      39.50 us      10.00 us     152.00 us            132     INODELK
      0.00     330.97 us      20.00 us   14280.00 us             62      LOOKUP
      0.00      79.06 us      19.00 us     851.00 us            430    FXATTROP
      0.02      29.32 us       7.00 us   28154.00 us          22568    FINODELK
      7.80 1313096.68 us     125.00 us 23281862.00 us            189       FSYNC
     92.18     397.92 us      76.00 us 1838343.00 us        7372799       WRITE
Duration: 7811 seconds
Data Read: 0 bytes
Data Written: 966367641600 bytes
Interval 0 Stats:
   Block Size:              131072b+              262144b+
 No. of Reads:                     0                     0
No. of Writes:               7372798                     1

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE
      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR
      0.00      61.50 us      46.00 us      77.00 us              2     ENTRYLK
      0.00      82.00 us      67.00 us      97.00 us              2       FLUSH
      0.00     265.00 us     265.00 us     265.00 us              1      CREATE
      0.00      57.43 us      30.00 us      85.00 us              7      STATFS
      0.00      61.12 us      37.00 us     107.00 us             16     OPENDIR
      0.00      44.04 us      12.00 us      86.00 us             24       FSTAT
      0.00      41.42 us      24.00 us      96.00 us             26    GETXATTR
      0.00      45.93 us      24.00 us     133.00 us             28     READDIR
      0.00      57.17 us      25.00 us     147.00 us             24        OPEN
      0.00     145.28 us      31.00 us     288.00 us             32    READDIRP
      0.00      39.50 us      10.00 us     152.00 us            132     INODELK
      0.00     330.97 us      20.00 us   14280.00 us             62      LOOKUP
      0.00      79.06 us      19.00 us     851.00 us            430    FXATTROP
      0.02      29.32 us       7.00 us   28154.00 us          22568    FINODELK
      7.80 1313096.68 us     125.00 us 23281862.00 us            189       FSYNC
     92.18     397.92 us      76.00 us 1838343.00 us        7372799       WRITE
Duration: 7811 seconds
Data Read: 0 bytes
Data Written: 966367641600 bytes
Does this make things any clearer?
2014-10-13 20:40 GMT+03:00 Roman <romeo.r@xxxxxxxxx>:
I think I may know what the issue was. There was an iscsitarget service running that was exporting this generated block device, so maybe my colleague's Windows server picked it up and mounted it :) I'll see if it happens again.
2014-10-13 20:27 GMT+03:00 Roman <romeo.r@xxxxxxxxx>:
So may I restart the volume and start the test, or do you need something else from this issue?
2014-10-13 19:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
On 10/13/2014 10:03 PM, Roman wrote:
IMO this can happen if there is an fd leak. open-fd is the only variable that can change with a volume restart. How do you re-create the bug?

Hmm, seems like another strange issue? I've seen this before; I had to restart the volume to get my empty space back.

root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# ls -l
total 943718400
-rw-r--r-- 1 root root 966367641600 Oct 13 16:55 disk
root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# rm disk
root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# df -h
Filesystem                                              Size  Used Avail Use% Mounted on
rootfs                                                  282G  1.1G  266G   1% /
udev                                                     10M     0   10M   0% /dev
tmpfs                                                   1.4G  228K  1.4G   1% /run
/dev/disk/by-uuid/c62ee3c0-c0e5-44af-b0cd-7cb3fbcc0fba  282G  1.1G  266G   1% /
tmpfs                                                   5.0M     0  5.0M   0% /run/lock
tmpfs                                                   5.2G     0  5.2G   0% /run/shm
stor1:HA-WIN-TT-1T                                     1008G  901G   57G  95% /srv/nfs/HA-WIN-TT-1T
No file, but the used size is still 901G. Both servers show the same. Do I really have to restart the volume to fix that?
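(As a rough way to confirm the fd-leak theory before restarting, something like the following could be checked; these are generic GlusterFS/Linux tools, not commands taken from this run:

gluster volume status HA-WIN-TT-1T fd      # open fd counts per brick
lsof +L1 | grep NFS-WIN                    # on stor1/stor2: files deleted but still held open under the brick path

If some client, e.g. the iscsitarget service mentioned above, still holds the deleted 900GB file open, the space is only released once that fd is closed.)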
Pranith
2014-10-13 19:30 GMT+03:00 Roman <romeo.r@xxxxxxxxx>:
Sure. I'll let it run for the night.
2014-10-13 19:19 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
hi Roman,
Do you think we can run this test again? This time, could you enable profiling with 'gluster volume profile <volname> start', do the same test, and provide the output of 'gluster volume profile <volname> info' and the logs after the test?
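(For reference, the full sequence for such a run would look roughly like this, using the volume name from this thread:

gluster volume profile HA-WIN-TT-1T start    # enable per-brick FOP statistics
# ... run the dd workload on the client ...
gluster volume profile HA-WIN-TT-1T info     # dump cumulative and interval stats
gluster volume profile HA-WIN-TT-1T stop     # optionally disable profiling afterwards
)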
Pranith
On 10/13/2014 09:45 PM, Roman wrote:
Sure!
root@stor1:~# gluster volume info
Volume Name: HA-2TB-TT-Proxmox-cluster
Type: Replicate
Volume ID: 66e38bde-c5fa-4ce2-be6e-6b2adeaa16c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: stor1:/exports/HA-2TB-TT-Proxmox-cluster/2TB
Brick2: stor2:/exports/HA-2TB-TT-Proxmox-cluster/2TB
Options Reconfigured:
nfs.disable: 0
network.ping-timeout: 10
Volume Name: HA-WIN-TT-1T
Type: Replicate
Volume ID: 2937ac01-4cba-44a8-8ff8-0161b67f8ee4
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: stor1:/exports/NFS-WIN/1T
Brick2: stor2:/exports/NFS-WIN/1T
Options Reconfigured:
nfs.disable: 1
network.ping-timeout: 10
2014-10-13 19:09 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
Could you give your 'gluster volume info' output?
Pranith
On 10/13/2014 09:36 PM, Roman wrote:
Hi,
I've got this kind of setup (the servers run a replica volume):
@ 10G backend:
gluster storage1
gluster storage2
gluster client1
@ 1G backend:
other gluster clients
Servers got HW RAID5 with SAS disks.
So today I decided to create a 900GB file for an iSCSI target, located on a separate GlusterFS volume, using dd (just a dummy file filled with zeros, bs=1G count=900).
First of all, the process took quite a lot of time; the writing speed was 130 MB/sec (the client port was 2 Gbps, the server ports were running at 1 Gbps). Then it reported something like "endpoint is not connected" and all of my VMs on the other volume started to give me IO errors. Server load was around 4.6 (total 12 cores).
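(The command was roughly the following; the exact output path is an assumption based on the mount point shown earlier in the thread:

dd if=/dev/zero of=/srv/nfs/HA-WIN-TT-1T/disk bs=1G count=900

bs=1G makes dd issue 1 GiB writes, which FUSE then splits into smaller chunks; the roughly 7.37 million 131072-byte writes in the profile output earlier in this thread are consistent with the 900 GiB file landing on the bricks in 128 KiB pieces.)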
Maybe it was due to the timeout of 2 secs, so I've made it a bit higher, 10 sec.
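(Presumably via something like the following, which matches the network.ping-timeout: 10 shown in the volume info above; the exact command used is not in the thread:

gluster volume set HA-2TB-TT-Proxmox-cluster network.ping-timeout 10
gluster volume set HA-WIN-TT-1T network.ping-timeout 10
)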
Also, during the dd image creation, the VMs very often reported that their disks were slow, like:
WARNINGs: Read IO Wait time is -0.02 (outside range [0:1]).
Is 130 MB/sec the maximum bandwidth for all of the volumes in total? Then why would we need 10G backends?
HW RAID local speed is 300 MB/sec, so it should not be an issue. Any ideas or maybe any advice?
Maybe someone has an optimized sysctl.conf for a 10G backend?
Mine is pretty simple, the kind you can find by googling.
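(For what it's worth, the generic 10GbE tuning that usually turns up when googling looks roughly like this; these are commonly suggested starting values, not settings taken from this setup:

# /etc/sysctl.conf - generic 10GbE starting points
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 30000
)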
Just to mention: those VMs were connected using a separate 1 Gbps interface, which means they should not be affected by the client with the 10G backend.
The logs are pretty useless; they just say this during the outage:
[2014-10-13 12:09:18.392910] W [client-handshake.c:276:client_ping_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: timer must have expired
[2014-10-13 12:10:08.389708] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-HA-2TB-TT-Proxmox-cluster-client-0: server 10.250.0.1:49159 has not responded in the last 2 seconds, disconnecting.
[2014-10-13 12:10:08.390312] W [client-handshake.c:276:client_ping_cbk] 0-HA-2TB-TT-Proxmox-cluster-client-0: timer must have expired
So I decided to set the timeout a bit higher.
So it seems to me that under high load GlusterFS is not usable? 130 MB/s is not that much, yet it is enough to cause timeouts and make the system so slow that the VMs suffer.
Of course, after the disconnection the healing process started, but since the VMs lost connection to both servers it was pretty useless; they could not run anymore. And by the way, when you load the server with such a huge job (dd of 900GB), the healing process goes very slow :)
--
Best regards,
Roman.
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://supercolony.gluster.org/mailman/listinfo/gluster-users