Re: Problems running AFR

"Krishna Srinivas" <krishna@xxxxxxxxxxxxx> · Tue, 24 Apr 2007 00:54:47 +0530

Hi Matt,

I just checked in a fix; please see whether it resolves the problem.

Regarding your question about re-syncing after a node has been
brought down: for now you need to handle it manually with the rsync
utility, but in the future this will be automated by glusterfs itself.
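
As a rough illustration of the manual rsync approach, here is a minimal
Python sketch that builds and runs an rsync command to copy one export
directory onto the server that was down. The function names are
hypothetical, the host name and directories are taken from Matt's
description and are only placeholders for your own layout, and it
assumes rsync over SSH works between the two servers. It is not part of
glusterfs.

```python
import shlex
import subprocess


def build_resync_command(src_dir, dest_host, dest_dir, dry_run=True):
    """Return an rsync argv copying the contents of src_dir to dest_host:dest_dir.

    Hypothetical helper for illustration only; not a glusterfs tool.
    """
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps perms and times)
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, change nothing
    # A trailing slash on the source means "the contents of this directory".
    src = src_dir.rstrip("/") + "/"
    dest = f"{dest_host}:{dest_dir.rstrip('/')}/"
    return cmd + [src, dest]


def resync(src_dir, dest_host, dest_dir, dry_run=True):
    """Print the rsync command, then run it; raises if rsync exits non-zero."""
    cmd = build_resync_command(src_dir, dest_host, dest_dir, dry_run)
    print(shlex.join(cmd))
    return subprocess.run(cmd, check=True)
```

For example, run on the server that stayed up, once the other one is
back: resync("/home/export", "server2", "/home/afr-export") prints the
command and does a dry run; pass dry_run=False to actually copy. Note
that without rsync's --delete option, files removed during the downtime
will remain on the stale copy.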

Thanks for reporting the bug,
Krishna

On 4/19/07, Matt Bennett <bennett.matthew@xxxxxxxxx> wrote:
Hello list,

Using 1.3.0-pre-2.3, I have configured a two-server, single-client
setup following [1]. Apart from the IP addresses, my config files are
identical to those in the user guide.

Both servers and client start and run happily. If I write to the
mounted filesystem with the client, the file I created appears in
/home/export on server1 and /home/afr-export on server2 (and vice
versa).

The problem comes when I take one of the servers down. If I take down
server1, the client can no longer write to the mounted filesystem - it
gives me the error "Transport endpoint is not connected", and the
client logs contain this:

[Apr 19 16:37:19] [ERROR/tcp-client.c:284/tcp_connect()]
tcp/client:non-blocking connect() returned: 111 (Connection refused)
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()]
protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19]
[DEBUG/client-protocol.c:2543/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [DEBUG/tcp-client.c:174/tcp_connect()] transport:
tcp: :try_connect: socket fd = 9
[Apr 19 16:37:19] [DEBUG/tcp-client.c:196/tcp_connect()] transport:
tcp: :try_connect: finalized on port `1021'
[Apr 19 16:37:19] [DEBUG/tcp-client.c:255/tcp_connect()]
tcp/client:connect on 9 in progress (non-blocking)
[Apr 19 16:37:19] [DEBUG/tcp-client.c:293/tcp_connect()]
tcp/client:connection on 9 still in progress - try later
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()]
protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19]
[DEBUG/client-protocol.c:2543/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [ERROR/tcp-client.c:284/tcp_connect()]
tcp/client:non-blocking connect() returned: 111 (Connection refused)
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()]
protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19]
[DEBUG/client-protocol.c:2543/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x8076f08
[Apr 19 16:37:19] [DEBUG/tcp-client.c:174/tcp_connect()] transport:
tcp: :try_connect: socket fd = 9
[Apr 19 16:37:19] [DEBUG/tcp-client.c:196/tcp_connect()] transport:
tcp: :try_connect: finalized on port `1021'
[Apr 19 16:37:19] [DEBUG/tcp-client.c:255/tcp_connect()]
tcp/client:connect on 9 in progress (non-blocking)
[Apr 19 16:37:19] [DEBUG/tcp-client.c:293/tcp_connect()]
tcp/client:connection on 9 still in progress - try later
[Apr 19 16:37:19] [ERROR/client-protocol.c:183/client_protocol_xfer()]
protocol/client: client_protocol_xfer: :transport_submit failed
[Apr 19 16:37:19]
[DEBUG/client-protocol.c:2543/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x8076f08
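
The "111 (Connection refused)" entries in the log above are what a TCP
connect() returns on Linux when nothing is listening on the target port,
which is expected while a server is stopped. A minimal Python sketch for
checking from the client machine which server is reachable might look
like this. The function names are hypothetical, and the host and port
are placeholders: use whatever address and port your server spec
listens on.

```python
import errno
import os
import socket


def probe(host, port, timeout=2.0):
    """Attempt a TCP connect; return 0 on success, else the errno value."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex() returns the error code instead of raising for
        # connect-level failures such as ECONNREFUSED.
        return s.connect_ex((host, port))


def describe(code):
    """Turn a probe() result into a short human-readable string."""
    if code == 0:
        return "reachable"
    name = errno.errorcode.get(code, str(code))
    return f"{name}: {os.strerror(code)}"
```

Calling describe(probe(host, port)) for each server shows whether the
client can reach it at the TCP level; it says nothing about whether AFR
itself is behaving correctly.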

If, however, I take down server2, then this problem doesn't occur.
Writes happen as normal, but obviously don't appear in
/home/afr-export on server1.

This brings me on to my next question - when does the resync happen
between servers? When I bring server2 back up again, files written to
server1 during the downtime are not copied to server2.

Is there a particular way to force the resync?

Many thanks,
Matt.

[1] http://www.gluster.org/docs/index.php/GlusterFS_User_Guide#AFR_Example_in_Clustered_Mode

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel