Hi,

I'm currently testing GlusterFS to see if it can meet our HA filesystem needs, and in initial testing things look very good, especially with client-side AFR performing replication to our server nodes. However, we would like to keep our client network free of replication traffic, so I set up server-side AFR with three storage bricks replicating data among themselves, plus round-robin DNS for node failover.
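For reference, the round-robin name is just three A records for the same host, along these lines (a minimal BIND-style sketch; the hostname and addresses match my lab, and the TTL is only illustrative):

; round-robin A records for the three storage nodes
storage.frankenlab.com.  60  IN  A  192.168.0.5
storage.frankenlab.com.  60  IN  A  192.168.0.6
storage.frankenlab.com.  60  IN  A  192.168.0.7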
The round-robin DNS is working and failover between the nodes mostly works, but if I pull the network cable on the currently active server (the host the glusterfs client is connected to), the next filesystem operation (such as ls /mnt/glusterfs) fails with a "Transport endpoint is not connected" error. Similarly, if a large copy is in progress, the copy exits with a failure. Every operation after that works fine, and netstat shows that the client has failed over to the next server in the list, but by that point the in-flight filesystem operation has already failed.

Anyway, this leads me to a few questions:

0. Do my config files look OK, or does it look like I've configured this thing incorrectly? :)

1. Is this the expected behavior, or is it a bug? From reading the mailing list I had the impression that on failure the operation would be retried against the remaining IPs cached in the client's list, so I was surprised that the operation failed. I think it is probably a bug, but I could see an argument for considering this normal operation.

2. If this is expected behavior, is there any plan to change it in the future, or is server-side AFR always expected to work this way? I've seen references on the mailing list to round-robin DNS being an interim measure, so I'm not sure whether another translator is in the works. If something is in the works, is it available in the current glusterfs 1.4 snapshot releases, or is it planned for a much later version?

3. Can you think of any option I might have missed that would correct the problem and allow the currently running file operation to succeed during a failover?

4. Once again, if this is as designed, can you explain why it works this way? As I said, I expected it to fail over transparently in much the same way that client-side AFR seems to, so I was surprised that it didn't.

Since I hope this is a bug, the configuration files and the relevant section of the client log are below. I have used this configuration with both glusterfs 1.3.11 and the latest snapshot from August 27, 2008.

Client Log Snippet:
================
2008-08-27 12:53:34 D [fuse-bridge.c:839:fuse_err_cbk] glusterfs-fuse: 62: (op_num=24) ERR => 0
2008-08-27 12:54:11 W [client-protocol.c:216:call_bail] cluster: activating bail-out. pending frames = 1. last sent = 2008-08-27 12:52:51. last received = 2008-08-27 12:53:34 transport-timeout = 10
2008-08-27 12:54:11 C [client-protocol.c:223:call_bail] cluster: bailing transport
2008-08-27 12:54:11 D [socket.c:183:__socket_disconnect] cluster: shutdown () returned 0. setting connection state to -1
2008-08-27 12:54:11 W [socket.c:93:__socket_rwv] cluster: EOF from peer 192.168.0.5:6996
2008-08-27 12:54:11 D [socket.c:568:socket_proto_state_machine] cluster: socket read failed (Transport endpoint is not connected) in state 1 (192.168.0.5:6996)
2008-08-27 12:54:11 D [client-protocol.c:4150:protocol_client_cleanup] cluster: cleaning up state in transport object 0x867e388
2008-08-27 12:54:11 E [client-protocol.c:4201:protocol_client_cleanup] cluster: forced unwinding frame type(1) op(34) reply=@0x86b9318
2008-08-27 12:54:11 D [inode.c:443:__inode_create] fuse/inode: create inode (0):
2008-08-27 12:54:11 D [inode.c:268:__inode_activate] fuse/inode: activating inode(0), lru=5/0 active=2 purge=0
2008-08-27 12:54:11 E [socket.c:1187:socket_submit] cluster: transport not connected to submit (priv->connected = 255)
2008-08-27 12:54:11 E [fuse-bridge.c:380:fuse_entry_cbk] glusterfs-fuse: 63: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-08-27 12:54:11 D [inode.c:311:__inode_retire] fuse/inode: retiring inode(0) lru=5/0 active=1 purge=1
2008-08-27 12:54:11 D [client-protocol.c:4123:client_protocol_reconnect] cluster: attempting reconnect
2008-08-27 12:54:11 D [name.c:187:af_inet_client_get_remote_sockaddr] cluster: option remote-port missing in volume cluster. Defaulting to 6996
2008-08-27 12:54:11 D [common-utils.c:250:gf_resolve_ip6] resolver: returning ip-192.168.0.7 (port-6996) for hostname: storage.frankenlab.com and port: 6996
2008-08-27 12:54:11 D [common-utils.c:270:gf_resolve_ip6] resolver: next DNS query will return: ip-192.168.0.6 port-6996
2008-08-27 12:54:11 D [client-protocol.c:4681:notify] cluster: got GF_EVENT_CHILD_UP
2008-08-27 12:54:11 D [socket.c:924:socket_connect] cluster: connect () called on transport already connected
2008-08-27 12:54:11 D [client-protocol.c:4063:client_setvolume_cbk] cluster: SETVOLUME on remote-host succeeded
2008-08-27 12:54:12 D [client-protocol.c:4129:client_protocol_reconnect] cluster: breaking reconnect chain
2008-08-27 12:54:17 D [fuse-bridge.c:352:fuse_entry_cbk] glusterfs-fuse: 64: (op_num=34) / => 1
2008-08-27 12:54:17 D [fuse-bridge.c:1640:fuse_opendir] glusterfs-fuse: 65: OPEN /
2008-08-27 12:54:17 D [fuse-bridge.c:585:fuse_fd_cbk] glusterfs-fuse: 65: (op_num=22) / => 0x86819b8
2008-08-27 12:54:17 D [fuse-bridge.c:352:fuse_entry_cbk] glusterfs-fuse: 66: (op_num=34) / => 1

Client Configuration File:
====================
volume cluster
  type protocol/client
  option transport-type tcp/client
  option remote-host storage.frankenlab.com
  option remote-subvolume gfs
  option transport-timeout 10
end-volume
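For completeness, the client is mounted with the usual glusterfs invocation, roughly as follows (the paths are from my lab, and the flags are as I recall them for the 1.3/1.4 series, so treat this as approximate):

glusterfs -f /etc/glusterfs/glusterfs-client.vol -l /var/log/glusterfs/client.log -L DEBUG /mnt/glusterfs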
Server Configuration File:
=====================
volume gfs-ds
  type storage/posix
  option directory /mnt/test
end-volume

volume gfs-ds-locks
  type features/posix-locks
  subvolumes gfs-ds
end-volume

### Add remote client volumes
volume gfs-storage2-ds
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.6
  option remote-subvolume gfs-ds
  option transport-timeout 10
end-volume

volume gfs-storage3-ds
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.7
  option remote-subvolume gfs-ds
  option transport-timeout 10
end-volume

volume gfs-ds-afr
  type cluster/afr
  subvolumes gfs-ds-locks gfs-storage2-ds gfs-storage3-ds
end-volume

volume gfs
  type performance/io-threads
  option thread-count 1
  option cache-size 32MB
  subvolumes gfs-ds-afr
end-volume

### Add network serving capability to the above brick.
volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes gfs
  option auth.addr.gfs-ds-locks.allow *
  option auth.addr.gfs.allow *
end-volume

Thanks,
James Warner
Computer Sciences Corporation