Thanks a lot, Pranith!
All seems back to normal again.
Looking forward to the release of 3.5.1 !
Cheers,
Olav
On 11/06/14 09:30, Pranith Kumar Karampuri wrote:
hey
Just do "gluster volume start <volname> force" and things
should be back to normal
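For reference, a minimal sketch of that recovery on this setup
(assuming the volume name sr_vol01 from the logs below; "start ...
force" should only start brick processes that are down and leave
the running bricks alone):
# gluster volume start sr_vol01 force
# gluster volume status sr_vol01
The status output should then show all bricks online again. Since
this volume is replicated, checking self-heal afterwards with
"gluster volume heal sr_vol01 info" may also be worthwhile.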
Pranith
On 06/11/2014 12:56 PM, Olav Peeters wrote:
Pranith,
how could I temporarily move all data off the two problem bricks
until the release of 3.5.1?
Like this?
# gluster volume replace-brick VOLNAME BRICK NEW-BRICK start
Will this work if the bricks are offline?
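(As I understand it from the docs, the full replace-brick workflow
would be something like the following; the brick paths here are just
placeholders for one of the offline bricks and a fresh directory on
the same or another node:
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new start
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new status
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new commit
but I don't know whether the migration can run at all while the
source brick process is down.)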
Or is there some other way to get the bricks back online manually?
Would it help to do all fuse connections via NFS until after the fix?
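(I.e. something along these lines, mounting through the built-in
Gluster NFS server instead of FUSE; the mount point is just an
example, and vers=3 is needed since the Gluster NFS server only
speaks NFSv3:
# mount -t nfs -o vers=3,mountproto=tcp one-of-the-gluster-nodes:/sr_vol01 /mnt/sr_vol01
)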
Cheers,
Olav
On 11/06/14 08:44, Olav Peeters wrote:
OK, thanks for the info!
Regards,
Olav
On 11/06/14 08:38, Pranith Kumar Karampuri wrote:
On 06/11/2014 12:03 PM, Olav Peeters wrote:
Thanks Pranith!
I see this at the end of the log files of one of the problem
bricks (the first two errors are repeated several times):
[2014-06-10 09:55:28.354659] E
[rpcsvc.c:1206:rpcsvc_submit_generic] 0-rpc-service: failed to
submit message (XID: 0x103c59, Program: GlusterFS 3.3, ProgVers:
330, Proc: 30) to rpc-transport (tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply]
(-->/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9)
[0x7f8c8e82f189]
(-->/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed)
[0x7f8c8e1f22ed]
(-->/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad)
[0x7f8c8dfc555d]))) 0-: Reply submission failed
pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a]
/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1]
/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]
---------
Again, no info to be found online about this error.
Any idea?
This is because of bug 1089470, which is fixed in 3.5.1. That release
will be out shortly.
Pranith
Olav
On 11/06/14 04:42, Pranith Kumar Karampuri wrote:
Olav,
Check logs of the bricks to see why the bricks went down.
Pranith
On 06/11/2014 04:02 AM, Olav Peeters wrote:
Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago.
Everything was running fine until this morning, when we started
having write issues on a fuse mount. Creating and deleting files
suddenly started failing, without any changes to the cluster.
In /var/log/glusterfs/glustershd.log every couple of seconds I'm
getting this:
[2014-06-10 22:23:52.055128] I
[rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13:
changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E
[socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13:
connection to ip-of-one-of-the-gluster-nodes:49156 failed
(Connection refused)
# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.
A rebalance fails as well.
iptables was stopped on all nodes, so this is not a firewall issue.
If I cd into the two bricks that are offline according to
gluster volume status, I can read and write without any problems... The
disks are clearly fine: they are mounted and available.
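To double-check, the brick processes themselves can be inspected on
the affected node, e.g. (a rough sketch, the grep pattern is just the
volume name):
# gluster volume status sr_vol01
# ps aux | grep glusterfsd | grep sr_vol01
The "Online" column in the status output and the glusterfsd processes
per brick should match up, and the corresponding brick logs under
/var/log/glusterfs/bricks/ should say why a brick process exited.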
I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?
Cheers,
Olav
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users