I failed to retain the Cc: gluster-devel@xxxxxxxxxx when replying, so I am
reposting to the list.

On Fri, Mar 01, 2013 at 10:25:20AM +0530, krish wrote:
> Could you check if glusterd was running on the host "hotstuff", when
> the client experiences spurious disconnects?
> To confirm this when you notice the 'spurious' disconnects, try
> # telnet hotstuff 24007

It is alive and answers TCP connections. (I have not touched anything
since it exhibited the error, and it seems to be able to recover on
its own.)

Here is hotstuff's glusterd log at the failure time:

[2013-02-28 18:26:56.275537] W [socket.c:514:__socket_rwv] 0-management: readv failed (Connection timed out)
[2013-02-28 18:26:56.275589] W [socket.c:1962:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Connection timed out), peer (192.0.2.98:24007)
[2013-02-28 18:27:01.584174] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-02-28 18:27:01.585770] I [glusterd-handshake.c:428:glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-02-28 18:27:11.690760] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-02-28 18:27:11.700733] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-02-28 18:32:56.305500] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-02-28 18:32:56.307096] I [glusterd-handshake.c:428:glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-02-28 19:21:50.328141] W [socket.c:514:__socket_rwv] 0-management: readv failed (Connection timed out)
[2013-02-28 19:21:50.344508] W [socket.c:1962:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Connection timed out), peer (192.0.2.98:24007)
[2013-02-28 19:21:52.064518] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-02-28 19:21:52.066116] I [glusterd-handshake.c:428:glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 1
[2013-03-01 02:03:13.841326] I [glusterd-handler.c:886:glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2013-03-01 02:03:35.137422] I [glusterd-handler.c:934:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@xxxxxxxxxx