Thanks Pranith!
I see this at the end of the log file of one of the problem bricks (the
first two errors are repeated several times):
[2014-06-10 09:55:28.354659] E [rpcsvc.c:1206:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0x103c59, Program:
GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
(tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply]
(-->/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9)
[0x7f8c8e82f189]
(-->/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed)
[0x7f8c8e1f22ed]
(-->/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad)
[0x7f8c8dfc555d]))) 0-: Reply submission failed
pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a]
/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1]
/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]
---------
Again, I can't find any information about this error online.
Any ideas?
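In the meantime, is a forced volume start the right way to bring just the crashed brick processes back up without disturbing the healthy ones? My understanding from the standard gluster CLI (so treat the exact invocation as an assumption on my part, not something I've confirmed on this cluster yet) is:

```shell
# "start force" restarts only brick processes that are down;
# bricks that are already running are left untouched.
gluster volume start sr_vol01 force

# Then verify that all 18 bricks report as online again.
gluster volume status sr_vol01
```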
Olav
On 11/06/14 04:42, Pranith Kumar Karampuri wrote:
Olav,
Check logs of the bricks to see why the bricks went down.
Pranith
On 06/11/2014 04:02 AM, Olav Peeters wrote:
Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago. Everything ran
fine until this morning, when we suddenly started having write issues
on a FUSE mount: creating and deleting files became a problem, without
any changes having been made to the cluster.
In /var/log/glusterfs/glustershd.log every couple of seconds I'm
getting this:
[2014-06-10 22:23:52.055128] I [rpc-clnt.c:1685:rpc_clnt_reconfig]
0-sr_vol01-client-13: changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E [socket.c:2161:socket_connect_finish]
0-sr_vol01-client-13: connection to
ip-of-one-of-the-gluster-nodes:49156 failed (Connection refused)
# gluster volume status sr_vol01
shows that two of the 18 bricks are offline.
A rebalance fails.
iptables was stopped on all nodes.
If I cd into the two bricks that gluster v status reports as offline,
I can read and write without any problems... The disks are clearly
fine: they are mounted and available.
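Since the disks themselves look healthy, I assume the brick daemons are the problem. What I can check on an affected node with standard tools (the port number below is taken from the glustershd log, not verified independently) is whether the brick processes are running at all and whether anything is listening on that port:

```shell
# Is a glusterfsd process serving this brick at all?
# (the [g] trick keeps grep from matching its own command line)
ps aux | grep '[g]lusterfsd'

# Is anything listening on the port the self-heal daemon is
# trying to reach (49156, per the glustershd.log entry above)?
ss -tlnp | grep 49156
```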
I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?
Cheers,
Olav
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users