On 11/27/2015 10:52 AM, Sahina Bose wrote:
> [+ gluster-users]
>
> On 11/26/2015 08:37 PM, paf1@xxxxxxxx wrote:
>> Hello,
>> can anybody help me with these timeouts?
>> The volumes are not active yet (bricks down).
>>
>> Description of the gluster setup is below ...
>>
>> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>> [2015-11-26 14:44:47.174221] I [MSGID: 106004] [glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer <1hp1-SAN> (<87fc7db8-aba8-41f2-a1cd-b77e83b17436>), in state <Peer in Cluster>, has disconnected from glusterd.
>> [2015-11-26 14:44:47.174354] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P1 not held
>> [2015-11-26 14:44:47.174444] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P3 not held
>> [2015-11-26 14:44:47.174521] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P1 not held
>> [2015-11-26 14:44:47.174662] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P3 not held
>> [2015-11-26 14:44:47.174532] W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify] 0-management: Lock not released for 2HP12-P1
>> [2015-11-26 14:44:47.174675] W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify] 0-management: Lock not released for 2HP12-P3
>> [2015-11-26 14:44:49.423334] I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
>> The message "I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req" repeated 4 times between [2015-11-26 14:44:49.423334] and [2015-11-26 14:44:49.429781]
>> [2015-11-26 14:44:51.148711] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30702
>> [2015-11-26 14:44:52.177266] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 12, Invalid argument
>> [2015-11-26 14:44:52.177291] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>> [2015-11-26 14:44:53.180426] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 17, Invalid argument
>> [2015-11-26 14:44:53.180447] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>> [2015-11-26 14:44:52.395468] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30702
>> [2015-11-26 14:44:54.851958] I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
>> [2015-11-26 14:44:57.183969] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 19, Invalid argument
>> [2015-11-26 14:44:57.183990] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>
>> After volume creation everything works fine (volumes up), but then, after several reboots (yum updates), the volumes failed due to timeouts.
>>
>> Gluster description:
>>
>> 4 nodes with 4 volumes, replica 2
>> oVirt 3.6 - the latest
>> gluster 3.7.6 - the latest
>> vdsm 4.17.999 - from the git repo
>> oVirt - mgmt. nodes: 172.16.0.0
>> oVirt - bricks: 16.0.0.0 ("SAN" - defined as the "gluster" net)
>> Network works fine, no lost packets
>>
>> # gluster volume status
>> Staging failed on 2hp1-SAN. Please check log file for details.
>> Staging failed on 1hp2-SAN. Please check log file for details.
>> Staging failed on 2hp2-SAN. Please check log file for details.

Looking at the glusterd log from the above nodes (2hp1-SAN, 1hp2-SAN, 2hp2-SAN) will give you the exact reason for the failure. Could you attach the glusterd log from any one of these nodes? (There are a few commands at the end of this mail that should gather everything needed.)

>>
>> # gluster volume info
>>
>> Volume Name: 1HP12-P1
>> Type: Replicate
>> Volume ID: 6991e82c-9745-4203-9b0a-df202060f455
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 1hp1-SAN:/STORAGE/p1/G
>> Brick2: 1hp2-SAN:/STORAGE/p1/G
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> Volume Name: 1HP12-P3
>> Type: Replicate
>> Volume ID: 8bbdf0cb-f9b9-4733-8388-90487aa70b30
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 1hp1-SAN:/STORAGE/p3/G
>> Brick2: 1hp2-SAN:/STORAGE/p3/G
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> Volume Name: 2HP12-P1
>> Type: Replicate
>> Volume ID: e2cd5559-f789-4636-b06a-683e43e0d6bb
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 2hp1-SAN:/STORAGE/p1/G
>> Brick2: 2hp2-SAN:/STORAGE/p1/G
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> Volume Name: 2HP12-P3
>> Type: Replicate
>> Volume ID: b5300c68-10b3-4ebe-9f29-805d3a641702
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 2hp1-SAN:/STORAGE/p3/G
>> Brick2: 2hp2-SAN:/STORAGE/p3/G
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> Regards, and thanks for any hints
>> Paf1
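
For collecting that, something along these lines run on any one of the failing peers should be enough. This is only a rough sketch: it assumes the default glusterd log location quoted above, uses only standard gluster CLI commands, and the 200-line tail is an arbitrary choice, so adjust to your installation:

  # on e.g. 2hp1-SAN (same commands on 1hp2-SAN and 2hp2-SAN)
  gluster --version
  gluster peer status
  gluster volume status
  # tail of the management daemon log around the time of the failure
  tail -n 200 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log

Attaching that output from one node (plus, if possible, the same log from its replica partner) should show exactly why the staging step of "gluster volume status" is failing on those peers.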