Hello,

I have a client machine that mounts a replicated (x2) volume over NFS. In practice this is configured with automount, using a map entry such as:

DIR-NAME -rw,soft,intr server1,server2:/VOLUME

The Gluster servers are running 3.6.7. Sometimes NFS blocks on the client with "server server2 not responding, timed out" (in this case it was connected to server2), but network communication between the two machines is fine (they are connected to the same switch, I can ssh to each, they ping each other…).
I can also see a few "xs_tcp_setup_socket: connect returned unhandled error -107" messages on the client.
On the 'server2' side I can see the following in the Gluster NFS logs:

[2016-12-01 10:50:15.887927] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:50:15.887965] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:50:15.901880] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
[2016-12-01 10:50:15.901900] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:51:03.777145] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:51:03.777191] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-12-01 10:51:03.790561] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
[2016-12-01 10:51:03.790580] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully

These entries appear at times that correspond to the NFS timeouts.

This problem occurs often (at least once every day or two), and neither the client nor the servers are under heavy load (memory and CPU are far from saturated).
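
To make it easier to correlate these warnings with the client-side timeouts, the log entries quoted above can be tallied by requested RPC program and version. The following is only a minimal Python sketch: the regular expression is an assumption based solely on the sample lines in this message, and the names `REQ_PATTERN`, `tally_unavailable_versions` and `SAMPLE_LOG` are illustrative, not part of Gluster. For context, 100003 is the standard ONC RPC program number for NFS, so "(req 100003 2)" and "(req 100003 4)" read as requests for NFS versions 2 and 4.

```python
import re
from collections import Counter

# Matches the "(req <program> <version>)" part of the warning lines.
# Assumption: the format is inferred only from the log excerpt in this message.
REQ_PATTERN = re.compile(
    r"RPC program version not available \(req (\d+) (\d+)\)"
)


def tally_unavailable_versions(log_text):
    """Count warnings per (RPC program number, requested version)."""
    counts = Counter()
    for match in REQ_PATTERN.finditer(log_text):
        program = int(match.group(1))
        version = int(match.group(2))
        counts[(program, version)] += 1
    return counts


# The four warning lines from the excerpt above.
SAMPLE_LOG = """\
[2016-12-01 10:50:15.887927] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:50:15.901880] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
[2016-12-01 10:51:03.777145] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 2)
[2016-12-01 10:51:03.790561] W [rpcsvc.c:261:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4)
"""

if __name__ == "__main__":
    for (program, version), n in sorted(tally_unavailable_versions(SAMPLE_LOG).items()):
        print(f"program {program} version {version}: {n} warning(s)")
```

In practice the same function could be fed the contents of the server's NFS log file instead of `SAMPLE_LOG`; the log path is deliberately left out here because it depends on the installation.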
Any idea what the cause could be and how to prevent it from occurring? I reduced the autofs timeout to limit the impact, but that is not a very nice solution…

Note: I can't use the glusterfs client instead of NFS because of the memory leaks that still exist in it.
Thanks.

Regards,
--
Y.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users