Hi,
I've had the same problem as described in this earlier list message.
I tried a replace-brick operation. It copied a percentage of the files to the new brick and then hung the volume, which caused the clients to hang as well. A glusterfsd process stayed stuck at 100% CPU usage, even after I issued a gluster replace-brick pause command.
gluster replace-brick abort doesn't work: it hangs the management interface for 30 minutes until it times out. Even after rebooting the node I can't get abort to work. The status of the replace operation is paused, and the following is logged continually in the brick log:
[2014-03-19 10:40:30.578834] W [dict.c:995:data_to_str] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(socket_connect+0xab) [0x7f4ab0ee5b9b] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x155) [0x7f4ab0eed1f5] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(client_fill_address_family+0x2bb) [0x7f4ab0eed05b]))) 0-dict: data is NULL
[2014-03-19 10:40:30.578954] W [dict.c:995:data_to_str] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(socket_connect+0xab) [0x7f4ab0ee5b9b] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x155) [0x7f4ab0eed1f5] (-->/usr/lib64/glusterfs/3.3.2/rpc-transport/socket.so(client_fill_address_family+0x2c6) [0x7f4ab0eed066]))) 0-dict: data is NULL
[2014-03-19 10:42:42.607440] E [name.c:141:client_fill_address_family] 0-volname-replace-brick: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
I don't know whether this is similar to this bug.
How do I get the cluster to abort the replace-brick operation?
Thanks
J.