Re: Glusterd can't start up

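This is a known issue in 3.7.1: the rebalance code path in glusterd is broken,
which is why glusterd aborts while restarting rebalance at startup (see
glusterd_restart_rebalance and glusterd_volume_defrag_restart in your backtrace).
The fix will be released in 3.7.2.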
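
To confirm that a crash on another node is the same one, it can help to pull the first glusterd.so frame out of the backtrace in the log. Below is a minimal Python sketch, assuming the frame format shown in the quoted log (object path, symbol+offset in parentheses, address in brackets). The helper names parse_frames and first_frame_in are hypothetical and not part of any Gluster tooling.

```python
import re

# Matches backtrace frames such as:
#   /lib64/libc.so.6(abort+0x148)[0x7f15d2bb3cc8]
#   /lib64/libc.so.6(+0x35650)[0x7f15d2bb2650]      (no symbol name)
# An optional leading "> " (mail quoting) is tolerated.
FRAME_RE = re.compile(
    r"^(?:>\s*)?(?P<obj>/[^\s(]+)"
    r"\((?P<sym>[^)+]*)\+0x[0-9a-fA-F]+\)"
    r"\[0x[0-9a-fA-F]+\]\s*$"
)


def parse_frames(lines):
    """Return (object_path, symbol_or_None) for every backtrace frame found."""
    frames = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("obj"), match.group("sym") or None))
    return frames


def first_frame_in(frames, object_name):
    """Return the first named symbol whose object path ends with object_name."""
    for obj, sym in frames:
        if obj.endswith(object_name) and sym:
            return sym
    return None


# A few frames copied from the log in this thread.
SAMPLE = """\
> /lib64/libc.so.6(__fortify_fail+0x37)[0x7f15d2c8aa57]
> /lib64/libc.so.6(__snprintf_chk+0x78)[0x7f15d2c88248]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_volume_defrag_restart+0x191)[0x7f15c9053931]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_restart_rebalance+0x82)[0x7f15c9059aa2]
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE.splitlines())
    print(first_frame_in(frames, "glusterd.so"))
```

If the reported symbol is glusterd_volume_defrag_restart, directly below __snprintf_chk and __fortify_fail, the node is most likely hitting the same rebalance restart path.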

~Atin

On 06/11/2015 12:21 PM, 何亦军 wrote:
> Hi all,
> 
> I upgraded my GlusterFS pool from 3.6.2 to 3.7.1. The nodes run CentOS 7.1.1503.
> Some servers work well, but on this server glusterd fails to start. Can anyone help me?
> 
> Some messages are below:
> 
> [root@gwgfs02 bricks]# systemctl status glusterd
> glusterd.service - GlusterFS, a clustered file-system server
>    Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled)
>    Active: failed (Result: signal) since Thu 2015-06-11 14:37:10 CST; 3s ago
>   Process: 4166 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid (code=exited, status=0/SUCCESS)
> Main PID: 4167 (code=killed, signal=ABRT)
> 
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: llistxattr 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: setfsid 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: spinlock 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: epoll.h 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: xattr.h 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: st_atim.tv_nsec 1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: package-string: glusterfs 3.7.1
> Jun 11 14:37:10 gwgfs02 etc-glusterfs-glusterd.vol[4167]: ---------
> Jun 11 14:37:10 gwgfs02 systemd[1]: glusterd.service: main process exited, code=killed, status=6/ABRT
> Jun 11 14:37:10 gwgfs02 systemd[1]: Unit glusterd.service entered failed state.
> 
> Some log entries from etc-glusterfs-glusterd.vol.log:
> [2015-06-11 06:37:10.187333] W [rdma.c:4493:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
> [2015-06-11 06:37:10.187357] W [rdma.c:4793:init] 0-rdma.management: Failed to initialize IB Device
> [2015-06-11 06:37:10.187367] W [rpc-transport.c:358:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
> [2015-06-11 06:37:10.187473] W [rpcsvc.c:1595:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
> [2015-06-11 06:37:10.187490] E [glusterd.c:1515:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
> [2015-06-11 06:37:10.188848] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
> [2015-06-11 06:37:10.189361] I [glusterd-store.c:1986:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30700
> [2015-06-11 06:37:10.189475] I [glusterd.c:154:glusterd_uuid_init] 0-management: retrieved UUID: d79c0a67-155b-43a8-8b51-151cc97aa4da
> [2015-06-11 06:37:10.189557] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
> [2015-06-11 06:37:10.189769] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
> [2015-06-11 06:37:10.189931] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
> [2015-06-11 06:37:10.190093] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
> [2015-06-11 06:37:10.190287] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
> [2015-06-11 06:37:10.190515] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
> [2015-06-11 06:37:10.467359] I [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2015-06-11 06:37:10.467437] I [glusterd-handler.c:3387:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2015-06-11 06:37:10.467493] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2015-06-11 06:37:10.471021] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
> [2015-06-11 06:37:10.471039] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
> [2015-06-11 06:37:10.471159] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2015-06-11 06:37:10.474425] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 14, Invalid argument
> [2015-06-11 06:37:10.474442] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2:     type mgmt/glusterd
>   3:     option rpc-auth.auth-glusterfs on
>   4:     option rpc-auth.auth-unix on
>   5:     option rpc-auth.auth-null on
>   6:     option transport.socket.listen-backlog 128
>   7:     option ping-timeout 30
>   8:     option transport.socket.read-fail-log off
>   9:     option transport.socket.keepalive-interval 2
>  10:     option transport.socket.keepalive-time 10
>  11:     option transport-type rdma
>  12:     option working-directory /var/lib/glusterd
>  13: end-volume
>  14:
> +------------------------------------------------------------------------------+
> [2015-06-11 06:37:10.476457] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2015-06-11 06:37:10.553448] I [glusterd-rpc-ops.c:464:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: b80f71d0-6944-4236-af96-e272a1f7e739, host: 192.168.0.61, port: 0
> [2015-06-11 06:37:10.572277] I [glusterd-handler.c:2587:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: b80f71d0-6944-4236-af96-e272a1f7e739
> [2015-06-11 06:37:10.572312] I [glusterd-handler.c:2630:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
> [2015-06-11 06:37:10.572628] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
> [2015-06-11 06:37:10.572673] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on , (No such file or directory)
> [2015-06-11 06:37:10.573149] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
> [2015-06-11 06:37:10.575894] W [socket.c:3059:socket_connect] 0-glustershd: Ignore failed connection attempt on , (No such file or directory)
> [2015-06-11 06:37:10.578510] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
> [2015-06-11 06:37:10.581415] W [socket.c:3059:socket_connect] 0-quotad: Ignore failed connection attempt on , (No such file or directory)
> [2015-06-11 06:37:10.581496] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
> [2015-06-11 06:37:10.581539] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
> [2015-06-11 06:37:10.584198] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2015-06-11 06:37:10.588633] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> pending frames:
> frame : type(0) op(0)
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 6
> time of crash:
> 2015-06-11 06:37:10
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.7.1
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f15d41c0d92]
> /lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f15d41db9ed]
> /lib64/libc.so.6(+0x35650)[0x7f15d2bb2650]
> /lib64/libc.so.6(gsignal+0x37)[0x7f15d2bb25d7]
> /lib64/libc.so.6(abort+0x148)[0x7f15d2bb3cc8]
> /lib64/libc.so.6(+0x75e07)[0x7f15d2bf2e07]
> /lib64/libc.so.6(__fortify_fail+0x37)[0x7f15d2c8aa57]
> /lib64/libc.so.6(+0x10bc10)[0x7f15d2c88c10]
> /lib64/libc.so.6(+0x10b32b)[0x7f15d2c8832b]
> /lib64/libc.so.6(__snprintf_chk+0x78)[0x7f15d2c88248]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_volume_defrag_restart+0x191)[0x7f15c9053931]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_restart_rebalance+0x82)[0x7f15c9059aa2]
> /usr/lib64/glusterfs/3.7.1/xlator/mgmt/glusterd.so(glusterd_spawn_daemons+0x4f)[0x7f15c9059b1f]
> /lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7f15d41fb482]
> /lib64/libc.so.6(+0x470f0)[0x7f15d2bc40f0]
> ---------
> 
> Some log entries from data-brick1-vol01.log:
> [2015-06-11 06:37:10.602714] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2015-06-11 06:37:10.612919] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 192.168.0.62:24007 failed (Connection reset by peer)
> [2015-06-11 06:37:10.613503] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f1074730ee6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f10744ff36e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f10744ff47e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f1074500e0c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f10745015c8] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2015-06-11 06:37:10.602886 (xid=0x1)
> [2015-06-11 06:37:10.613550] E [glusterfsd-mgmt.c:1604:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:vol01.gwgfs02.data-brick1-vol01)
> [2015-06-11 06:37:10.613599] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-: received signum (0), shutting down
> [2015-06-11 06:37:10.618382] I [socket.c:3358:socket_submit_request] 0-glusterfs: not connected (priv->connected = 0)
> [2015-06-11 06:37:10.618406] W [rpc-clnt.c:1566:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users
> 

-- 
~Atin