I can reproduce a problem, which I think may be a bug: if one of the two replica servers for a volume is down, clients are unable to mount the volume. I notice that it depends on which replica is down. If the replica that is down is on the same subnet as the client, the client fails to mount the volume. If the replica that is down is on a different subnet, the client fails over properly and mounts the volume.
Below are the errors logged on the server that is still up while the client is unable to mount the volume, that is, while the replica on the same subnet as the client is down. Any ideas? Should I open a bug?
[2015-07-01 05:43:08.428657] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 21, Invalid argument
[2015-07-01 05:43:08.428710] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-07-01 05:43:08.429260] E [socket.c:3071:socket_connect] 0-management: connection attempt on 10.1.0.100:24007 failed, (Connection refused)
[2015-07-01 05:43:08.429362] W [socket.c:642:__socket_rwv] 0-management: writev on 10.1.0.100:24007 failed (Success)
[2015-07-01 05:43:08.429623] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x199)[0x7fd3d3470d59] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1ae)[0x7fd3d323f43e] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd3d323f53e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fd3d3240e6b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x168)[0x7fd3d3241418] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2015-07-01 05:43:08.429387 (xid=0x5f)
[2015-07-01 05:43:08.429654] E [glusterd-handshake.c:2001:__glusterd_peer_dump_version_cbk] 0-: Error through RPC layer, retry again later
[2015-07-01 05:43:08.429820] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x199)[0x7fd3d3470d59] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1ae)[0x7fd3d323f43e] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd3d323f53e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fd3d3240e6b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x168)[0x7fd3d3241418] ))))) 0-management: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2015-07-01 05:43:08.429395 (xid=0x60)
[2015-07-01 05:43:08.429845] W [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk] 0-management: socket disconnected
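For anyone triaging a longer copy of this log, the following is a minimal Python sketch that pulls out the failed connection attempts (peer address, port, and reason). The function name failed_connections and both regular expressions are assumptions for illustration, derived only from the line shape shown above; they are not part of GlusterFS and may need adjusting for other log formats.

```python
import re

# Assumed line shape, taken from the log excerpt above:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+(?P<rest>.*)$"
)

# Assumed message shape: "connection attempt on HOST:PORT failed, (REASON)"
CONNECT_FAIL = re.compile(
    r"connection attempt on (?P<host>[\d.]+):(?P<port>\d+) failed, "
    r"\((?P<reason>[^)]*)\)"
)


def failed_connections(lines):
    """Return (timestamp, host, port, reason) for each failed connect line."""
    results = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a log line in this format
        f = CONNECT_FAIL.search(m.group("rest"))
        if f:
            results.append(
                (m.group("ts"), f.group("host"),
                 int(f.group("port")), f.group("reason"))
            )
    return results


# Two lines copied from the excerpt above, used as sample input.
sample = [
    "[2015-07-01 05:43:08.428710] E [socket.c:3015:socket_connect] "
    "0-management: Failed to set keep-alive: Invalid argument",
    "[2015-07-01 05:43:08.429260] E [socket.c:3071:socket_connect] "
    "0-management: connection attempt on 10.1.0.100:24007 failed, "
    "(Connection refused)",
]

for ts, host, port, reason in failed_connections(sample):
    print(ts, host, port, reason)
```

Run against the full log, this would show which peer addresses the surviving server keeps failing to reach on port 24007, which can then be compared with the address of the replica that is down.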
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-users