GlusterFS keeps crashing

Morning everyone,

Hoping someone can help me out with this. I've been running GlusterFS for a while now and everything was great. For about the last month, though, I'm lucky if it runs for a few days without crashing and bringing all the servers down.

Here's what I can see in the logs when a failure occurs.  I see this across all three hosts in the cluster.

[2015-05-19 04:12:33.761831] C [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] 0-www-client-0: server x.x.x.x:49157 has not responded in the last 42 seconds, disconnecting.
[2015-05-19 04:12:33.762269] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7ff0ae43c550] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7ff0ae211787] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff0ae21189e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7ff0ae211951] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7ff0ae211f1f] ))))) 0-www-client-0: forced unwinding frame type(GlusterFS 3.3) op(OPENDIR(20)) called at 2015-05-19 04:11:51.000813 (xid=0x4a67)
[2015-05-19 04:12:33.762302] E [client-rpc-fops.c:2686:client3_3_opendir_cbk] 0-www-client-0: remote operation failed: Transport endpoint is not connected. Path: <gfid:a1fb01c7-bc8e-4854-9760-8da8d62519bc> (a1fb01c7-bc8e-4854-9760-8da8d62519bc)
[2015-05-19 04:12:33.762436] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7ff0ae43c550] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7ff0ae211787] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff0ae21189e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7ff0ae211951] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7ff0ae211f1f] ))))) 0-www-client-0: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2015-05-19 04:11:51.000832 (xid=0x4a68)
[2015-05-19 04:12:33.762455] W [rpc-clnt-ping.c:154:rpc_clnt_ping_cbk] 0-www-client-0: socket disconnected
[2015-05-19 04:16:45.804515] C [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] 0-www-conf-client-0: server x.x.x.x:49156 has not responded in the last 42 seconds, disconnecting.
[2015-05-19 04:16:45.804884] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7ff0ae43c550] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7ff0ae211787] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff0ae21189e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7ff0ae211951] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7ff0ae211f1f] ))))) 0-www-conf-client-0: forced unwinding frame type(GlusterFS 3.3) op(OPENDIR(20)) called at 2015-05-19 04:16:03.000774 (xid=0x4a83)
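
In case it helps anyone line these disconnects up across the three hosts, here is a minimal Python sketch that pulls the "ping timer expired" events out of client log lines. The regular expression is an assumption based only on the two sample lines in this post, not on any official log format description, and the names PING_EXPIRED, SAMPLE and find_ping_timeouts are made up for illustration. It expects each log entry on a single line.

```python
import re
from datetime import datetime

# Pattern inferred from the sample log lines in this post (an assumption,
# not an official GlusterFS log format specification).
PING_EXPIRED = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] C "
    r"\[rpc-clnt-ping\.c:\d+:rpc_clnt_ping_timer_expired\] "
    r"(?P<client>\S+): server (?P<server>\S+) has not responded in "
    r"the last (?P<secs>\d+) seconds"
)

# The two ping-timeout entries from the log excerpt, one entry per line.
SAMPLE = [
    "[2015-05-19 04:12:33.761831] C [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] "
    "0-www-client-0: server x.x.x.x:49157 has not responded in the last 42 seconds, disconnecting.",
    "[2015-05-19 04:12:33.762455] W [rpc-clnt-ping.c:154:rpc_clnt_ping_cbk] "
    "0-www-client-0: socket disconnected",
    "[2015-05-19 04:16:45.804515] C [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] "
    "0-www-conf-client-0: server x.x.x.x:49156 has not responded in the last 42 seconds, disconnecting.",
]


def find_ping_timeouts(lines):
    """Return (timestamp, client, server, seconds) for each ping-timeout line."""
    events = []
    for line in lines:
        m = PING_EXPIRED.search(line)
        if m is None:
            continue  # not a ping-timeout entry
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        events.append((ts, m.group("client"), m.group("server"), int(m.group("secs"))))
    return events


if __name__ == "__main__":
    for ts, client, server, secs in find_ping_timeouts(SAMPLE):
        print(ts.isoformat(), client, server, secs)
```

Running the same function over the client log from each host and sorting the results by timestamp should show whether the disconnects hit every host at the same moment or spread from one brick to the others.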

Here's info about the version I'm running:

glusterfs 3.6.3 built on Apr 23 2015 16:12:23
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.


Any insight would be appreciated.

