Hi Matt,

I just checked in a fix; please see whether it resolves the problem.

Regarding your question about re-syncing after a node has been brought down: for now you need to manage it yourself with the rsync utility, but in future this will be automated by glusterfs itself.

Thanks for reporting the bug,
Krishna

On 4/19/07, Matt Bennett <bennett.matthew@xxxxxxxxx> wrote:
Hello list,

Using 1.3.0-pre-2.3, I have configured a two-server, single-client setup following [1]. Apart from the IP addresses, my config files are identical to those in the user guide.

Both servers and the client start and run happily. If I write to the mounted filesystem with the client, the file I created appears in /home/export on server1 and /home/afr-export on server2 (and vice versa).

The problem comes when I take one of the servers down. If I take down server1, the client can no longer write to the mounted filesystem - it gives me the error "Transport endpoint is not connected", and the client logs contain this:

[Apr 19 16:37:19] [ERROR/tcp-client.c:284/tcp_connect()] tcp/client:non-blocking connect() returned: 111 (Connection refused)
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()] protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19] [DEBUG/client-protocol.c:2543/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [DEBUG/tcp-client.c:174/tcp_connect()] transport: tcp: :try_connect: socket fd = 9
[Apr 19 16:37:19] [DEBUG/tcp-client.c:196/tcp_connect()] transport: tcp: :try_connect: finalized on port `1021'
[Apr 19 16:37:19] [DEBUG/tcp-client.c:255/tcp_connect()] tcp/client:connect on 9 in progress (non-blocking)
[Apr 19 16:37:19] [DEBUG/tcp-client.c:293/tcp_connect()] tcp/client:connection on 9 still in progress - try later
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()] protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19] [DEBUG/client-protocol.c:2543/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [ERROR/tcp-client.c:284/tcp_connect()] tcp/client:non-blocking connect() returned: 111 (Connection refused)
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()] protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19] [DEBUG/client-protocol.c:2543/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [DEBUG/tcp-client.c:174/tcp_connect()] transport: tcp: :try_connect: socket fd = 9
[Apr 19 16:37:19] [DEBUG/tcp-client.c:196/tcp_connect()] transport: tcp: :try_connect: finalized on port `1021'
[Apr 19 16:37:19] [DEBUG/tcp-client.c:255/tcp_connect()] tcp/client:connect on 9 in progress (non-blocking)
[Apr 19 16:37:19] [DEBUG/tcp-client.c:293/tcp_connect()] tcp/client:connection on 9 still in progress - try later
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()] protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19] [DEBUG/client-protocol.c:2543/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8076f08

If, however, I take down server2, then this problem doesn't occur. Writes happen as normal, but obviously don't appear in /home/afr-export on server1.

This brings me on to my next question - when does the resync happen between servers? When I bring server2 back up again, files written to server1 during the downtime are not copied to server2. Is there a particular way to force the resync?

Many thanks,
Matt.

[1] http://www.gluster.org/docs/index.php/GlusterFS_User_Guide#AFR_Example_in_Clustered_Mode

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel
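
For the manual re-sync with rsync mentioned in the reply, here is a minimal sketch of what that might look like, run on server1 after server2 comes back up. It is not a GlusterFS tool. The host name "server2", the use of rsync over ssh, and the copy direction (/home/export on server1 to /home/afr-export on server2, taken from Matt's description of where files appear) are all assumptions; the helper names build_rsync_command and resync are hypothetical. It defaults to a dry run so nothing is changed until the output has been checked.

```python
import shlex
import subprocess


def build_rsync_command(src_dir, dest_host, dest_dir, dry_run=True, delete=False):
    """Build an rsync command copying the contents of src_dir to dest_host:dest_dir.

    All arguments are illustrative; adjust them to your own volume layout.
    """
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps permissions and times)
    if dry_run:
        cmd.append("--dry-run")  # only report what would be transferred
    if delete:
        cmd.append("--delete")  # also remove files on the destination missing from the source
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    cmd.append(src_dir.rstrip("/") + "/")
    cmd.append(f"{dest_host}:{dest_dir.rstrip('/')}/")
    return cmd


def resync(src_dir, dest_host, dest_dir, dry_run=True):
    """Run the rsync command; raises CalledProcessError if rsync exits non-zero."""
    cmd = build_rsync_command(src_dir, dest_host, dest_dir, dry_run=dry_run)
    return subprocess.run(cmd, check=True)


# Show the command that would be run (assumed host and paths, see above).
print(shlex.join(build_rsync_command("/home/export", "server2", "/home/afr-export")))
# prints: rsync -a --dry-run /home/export/ server2:/home/afr-export/
```

Calling resync("/home/export", "server2", "/home/afr-export", dry_run=False) would perform the actual copy. The delete option is off by default because propagating deletions to the other server is only safe if you are sure which copy is current.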