Thanks a lot, Pranith!
All seems back to normal again.
Looking forward to the release of 3.5.1 !
Cheers,
Olav
On 11/06/14 09:30, Pranith Kumar Karampuri wrote:
hey
Just do "gluster volume start <volname> force" and things
should be back to normal
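For reference, a minimal sketch of that recovery on this setup
(assuming the volume name sr_vol01 from the logs below; "start ...
force" should only start brick processes that are down and leave
the running bricks alone):
# gluster volume start sr_vol01 force
# gluster volume status sr_vol01
The status output should then show all bricks online again. Since
this volume is replicated, checking self-heal afterwards with
"gluster volume heal sr_vol01 info" may also be worthwhile.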
Pranith
On 06/11/2014 12:56 PM, Olav Peeters wrote:
Pranith,
how could I temporarily move all data off the two problem bricks
until the release of 3.5.1?
Like this?
# gluster volume replace-brick VOLNAME BRICK NEW-BRICK start
Will this work if the bricks are offline?
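(As I understand it from the docs, the full replace-brick workflow
would be something like the following; the brick paths here are just
placeholders for one of the offline bricks and a fresh directory on
the same or another node:
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new start
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new status
# gluster volume replace-brick sr_vol01 node1:/bricks/b13 node2:/bricks/b13_new commit
but I don't know whether the migration can run at all while the
source brick process is down.)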
Or is there some other way to get the bricks back online manually?
Would it help to do all fuse connections via NFS until after the fix?
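(I.e. something along these lines, mounting through the built-in
Gluster NFS server instead of FUSE; the mount point is just an
example, and vers=3 is needed since the Gluster NFS server only
speaks NFSv3:
# mount -t nfs -o vers=3,mountproto=tcp one-of-the-gluster-nodes:/sr_vol01 /mnt/sr_vol01
)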
Cheers,
Olav
On 11/06/14 08:44, Olav Peeters wrote:
OK, thanks for the info!
Regards,
Olav
On 11/06/14 08:38, Pranith Kumar Karampuri wrote:
On 06/11/2014 12:03 PM, Olav Peeters wrote:
Thanks Pranith!
I see this at the end of the log files of one of the problem
bricks (the first two errors are repeated several times):
[2014-06-10 09:55:28.354659] E
[rpcsvc.c:1206:rpcsvc_submit_generic] 0-rpc-service: failed to
submit message (XID: 0x103c59, Program: GlusterFS 3.3, ProgVers:
330, Proc: 30) to rpc-transport (tcp.sr_vol01-server)
[2014-06-10 09:55:28.354683] E [server.c:190:server_submit_reply]
(-->/usr/lib64/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_finodelk_cbk+0xb9)
[0x7f8c8e82f189]
(-->/usr/lib64/glusterfs/3.5.0/xlator/debug/io-stats.so(io_stats_finodelk_cbk+0xed)
[0x7f8c8e1f22ed]
(-->/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_finodelk_cbk+0xad)
[0x7f8c8dfc555d]))) 0-: Reply submission failed
pending frames:
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
...
...
frame : type(0) op(30)
frame : type(0) op(30)
frame : type(0) op(30)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-06-10 09:55:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib64/libc.so.6(+0x329a0)[0x7f8c94aac9a0]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(grant_blocked_inode_locks+0xc1)[0x7f8c8ea54061]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(pl_inodelk_client_cleanup+0x249)[0x7f8c8ea54569]
/usr/lib64/glusterfs/3.5.0/xlator/features/locks.so(+0x6f0a)[0x7f8c8ea49f0a]
/usr/lib64/libglusterfs.so.0(gf_client_disconnect+0x5d)[0x7f8c964d701d]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_connection_cleanup+0x458)[0x7f8c8dfbda48]
/usr/lib64/glusterfs/3.5.0/xlator/protocol/server.so(server_rpc_notify+0x183)[0x7f8c8dfb9713]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_disconnect+0x105)[0x7f8c96261d35]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x1a0)[0x7f8c96263880]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f8c96264f98]
/usr/lib64/glusterfs/3.5.0/rpc-transport/socket.so(+0xa9a1)[0x7f8c914c39a1]
/usr/lib64/libglusterfs.so.0(+0x672f7)[0x7f8c964d92f7]
/usr/sbin/glusterfsd(main+0x564)[0x4075e4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8c94a98d1d]
/usr/sbin/glusterfsd[0x404679]
---------
Again, no info to be found online about this error.
Any idea?
This is because of bug 1089470, which is fixed in 3.5.1. That release
will be out shortly.
Pranith
Olav
On 11/06/14 04:42, Pranith Kumar Karampuri wrote:
Olav,
Check logs of the bricks to see why the bricks went down.
Pranith
On 06/11/2014 04:02 AM, Olav Peeters wrote:
Hi,
I upgraded from glusterfs 3.4 to 3.5 about 8 days ago.
Everything was running fine until this morning, when we started
having write issues on a fuse mount. Creating and deleting files
suddenly started failing, without any changes to the cluster.
In /var/log/glusterfs/glustershd.log every couple of seconds I'm
getting this:
[2014-06-10 22:23:52.055128] I
[rpc-clnt.c:1685:rpc_clnt_reconfig] 0-sr_vol01-client-13:
changing port to 49156 (from 0)
[2014-06-10 22:23:52.060153] E
[socket.c:2161:socket_connect_finish] 0-sr_vol01-client-13:
connection to ip-of-one-of-the-gluster-nodes:49156 failed
(Connection refused)
# gluster volume status sr_vol01
shows that two bricks of the 18 are offline.
A rebalance fails as well.
iptables was stopped on all nodes, so this is not a firewall issue.
If I cd into the two bricks that are offline according to
gluster volume status, I can read and write without any problems... The
disks are clearly fine: they are mounted and available.
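To double-check, the brick processes themselves can be inspected on
the affected node, e.g. (a rough sketch, the grep pattern is just the
volume name):
# gluster volume status sr_vol01
# ps aux | grep glusterfsd | grep sr_vol01
The "Online" column in the status output and the glusterfsd processes
per brick should match up, and the corresponding brick logs under
/var/log/glusterfs/bricks/ should say why a brick process exited.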
I cannot find much info online about the error.
Does anyone have an idea what could be wrong?
How can I get the two bricks back online?
Cheers,
Olav
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users