Gluster server crashes with signal 11 after probing peers.

Hi everyone.

I'm trying to add a new Gluster node to our cluster. When I probe the first node in the cluster from the new node, the new node crashes with the following report (the log starts when the daemon starts):

---------
[2016-03-30 20:32:05.191659] I [MSGID: 100030] [glusterfsd.c:2332:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.9 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2016-03-30 20:32:05.195695] I [MSGID: 106478] [glusterd.c:1337:init] 0-management: Maximum allowed open file descriptors set to 65536
[2016-03-30 20:32:05.195752] I [MSGID: 106479] [glusterd.c:1386:init] 0-management: Using /var/lib/glusterd as working directory
[2016-03-30 20:32:05.200609] W [MSGID: 103071] [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2016-03-30 20:32:05.200648] W [MSGID: 103055] [rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
[2016-03-30 20:32:05.200662] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2016-03-30 20:32:05.200723] W [rpcsvc.c:1597:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2016-03-30 20:32:05.200743] E [MSGID: 106243] [glusterd.c:1610:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2016-03-30 20:32:07.135310] I [MSGID: 106513] [glusterd-store.c:2062:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30501
[2016-03-30 20:32:07.135775] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2016-03-30 20:32:07.135876] I [rpc-clnt.c:984:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-03-30 20:32:07.136651] W [socket.c:870:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2016-03-30 20:32:07.136673] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2016-03-30 20:32:07.136908] I [MSGID: 106194] [glusterd-store.c:3523:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.socket.listen-backlog 128
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2016-03-30 20:32:07.138287] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-03-30 20:32:07.138980] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: ae191e96-9cd6-4e2b-acae-18f2cc45e6ed
[2016-03-30 20:32:07.139422] I [MSGID: 106163] [glusterd-handshake.c:1194:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30501
[2016-03-30 20:32:14.394056] I [MSGID: 106487] [glusterd-handler.c:1239:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req nfs1 24007
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2016-03-30 20:32:14
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.9
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x92)[0x7f0401a78562]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f0401a9464d]
/lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f0400e76d40]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_spin_lock+0x0)[0x7f04012120f0]
---------


Both nodes are running GlusterFS 3.7.9 on Ubuntu Trusty Tahr (14.04 LTS). Node 1 is running Linux kernel 3.13.0-55-generic #94-Ubuntu SMP, and node 3 is running Linux kernel 3.13.0-77-generic #121-Ubuntu SMP. As far as I can see, the kernel is the only difference between the systems, although the new node has the very latest version of the Gluster packages from the launchpad.net PPA. I would imagine that Node 1 has the same update, but it's hard to tell.
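
One way to check whether both nodes really run the same build is to pull the version and op-version out of each node's glusterd log. The following is only a minimal Python sketch, assuming the log format shown above; the function name summarize_glusterd_log and the two regular expressions are illustrative and not part of Gluster, and the log file paths are whatever is passed on the command line.

```python
import re
import sys

# These patterns match the two glusterd log messages quoted in the report above.
# They are assumptions about the log format, not an official Gluster interface.
VERSION_RE = re.compile(r"Started running \S+ version (\S+)")
OP_VERSION_RE = re.compile(r"retrieved op-version: (\d+)")


def summarize_glusterd_log(text):
    """Return the most recent version and op-version found in glusterd log text."""
    versions = VERSION_RE.findall(text)
    op_versions = OP_VERSION_RE.findall(text)
    return {
        "version": versions[-1] if versions else None,
        "op_version": int(op_versions[-1]) if op_versions else None,
    }


if __name__ == "__main__":
    # Usage: python summarize.py <glusterd log from node 1> <glusterd log from node 3>
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, summarize_glusterd_log(handle.read()))
```

Running it against a copy of the glusterd log from each node prints one summary per file, so any mismatch in version or op-version between the nodes is visible at a glance.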

Any help would be much appreciated.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users


