On 05/09/2015 04:23 PM, Kotresh Hiremath Ravishankar wrote:
> Hi,
>
> There are a few regression failures where the changelog translator's init
> fails and a core is generated, as explained below.
>
> 1. Why did the changelog translator's init fail?
>
> In the snapshot test cases, multiple virtual peers are set up on a single
> node, which causes 'Address already in use' and 'Port already in use'
> errors. Hence the changelog translator's init failed.
>
> 2. Even if the changelog translator failed, it should not dump core. Why
> the core?
>
> Well, the stack trace in the regression run didn't help much.
> I induced the error manually on a local system, traced it in gdb, and it
> happens as below.
>
> There is some memory corruption in the cleanup_and_exit path when a
> translator fails. I suppose this could happen for any failed translator
> init and is not specific to changelog. Could someone look into this?
>
> #0  0x00007ffff6cb67e0 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x00007ffff7b70db5 in __gf_free (free_ptr=0x7fffe4031700) at mem-pool.c:303
> #2  0x00007ffff7b7120c in mem_put (ptr=0x7fffe403171c) at mem-pool.c:570
> #3  0x00007ffff7b43fb4 in log_buf_destroy (buf=buf@entry=0x7fffe403171c) at logging.c:357
> #4  0x00007ffff7b47001 in gf_log_flush_list (copy=copy@entry=0x7fffeb80aa50, ctx=ctx@entry=0x614010) at logging.c:1711
> #5  0x00007ffff7b4720d in gf_log_flush_extra_msgs (new=0, ctx=0x614010) at logging.c:1777
> #6  gf_log_set_log_buf_size (buf_size=buf_size@entry=0) at logging.c:270
> #7  0x00007ffff7b47267 in gf_log_disable_suppression_before_exit (ctx=0x614010) at logging.c:437
> #8  0x00000000004080ec in cleanup_and_exit (signum=signum@entry=0) at glusterfsd.c:1217
> #9  0x0000000000408a16 in glusterfs_process_volfp (ctx=ctx@entry=0x614010, fp=fp@entry=0x7fffe40014f0) at glusterfsd.c:2183
> #10 0x000000000040ccf7 in mgmt_getspec_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7fffe4000fa4) at glusterfsd-mgmt.c:1560
> #11 0x00007ffff7915c70 in
>     rpc_clnt_handle_reply (clnt=clnt@entry=0x66d280, pollin=pollin@entry=0x7fffe4002540) at rpc-clnt.c:766
> #12 0x00007ffff7915ee4 in rpc_clnt_notify (trans=<optimized out>, mydata=0x66d2b0, event=<optimized out>, data=0x7fffe4002540) at rpc-clnt.c:894
> #13 0x00007ffff79121f3 in rpc_transport_notify (this=this@entry=0x66d6f0, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fffe4002540)
>     at rpc-transport.c:543
> #14 0x00007fffed2ca1f4 in socket_event_poll_in (this=this@entry=0x66d6f0) at socket.c:2290
> #15 0x00007fffed2ccfb4 in socket_event_handler (fd=fd@entry=8, idx=idx@entry=1, data=0x66d6f0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2403
> #16 0x00007ffff7b9aaba in event_dispatch_epoll_handler (event=0x7fffeb80ae90, event_pool=0x632c80) at event-epoll.c:572
> #17 event_dispatch_epoll_worker (data=0x66e8b0) at event-epoll.c:674
> #18 0x00007ffff6cb1ee5 in start_thread () from /lib64/libpthread.so.0
> #19 0x00007ffff65f8b8d in clone () from /lib64/libc.so.6

Probably another candidate for http://review.gluster.org/#/c/10417/ to go in?

~Atin

>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
>> From: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
>> To: "Vijay Bellur" <vbellur@xxxxxxxxxx>
>> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
>> Sent: Saturday, May 9, 2015 1:06:07 PM
>> Subject: Re: regression: brick crashed because of changelog xlator init failure
>>
>> It is crashing in libgcc!!!
>>
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x00007ff5555a1867 in ?? () from ./lib64/libgcc_s.so.1
>> Missing separate debuginfos, use: debuginfo-install
>> glibc-2.12-1.149.el6_6.7.x86_64 keyutils-libs-1.4-5.el6.x86_64
>> krb5-libs-1.10.3-37.el6_6.x86_64 libcom_err-1.41.12-21.el6.x86_64
>> libgcc-4.4.7-11.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64
>> openssl-1.0.1e-30.el6.8.x86_64 zlib-1.2.3-29.el6.x86_64
>> (gdb) bt
>> #0  0x00007ff5555a1867 in ??
>>     () from ./lib64/libgcc_s.so.1
>> #1  0x00007ff5555a2119 in _Unwind_Backtrace () from ./lib64/libgcc_s.so.1
>> #2  0x00007ff56170b8f6 in backtrace () from ./lib64/libc.so.6
>> #3  0x00007ff562826544 in _gf_msg_backtrace_nomem (level=GF_LOG_ALERT, stacksize=200)
>>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/logging.c:1097
>> #4  0x00007ff562845b82 in gf_print_trace (signum=11, ctx=0xabc010)
>>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/common-utils.c:618
>> #5  0x0000000000409646 in glusterfsd_print_trace (signum=11)
>>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd.c:2007
>> #6  <signal handler called>
>> #7  0x00007ff554484fa9 in ?? ()
>> #8  0x00007ff561d8b9d1 in start_thread () from ./lib64/libpthread.so.0
>> #9  0x00007ff5616f58fd in clone () from ./lib64/libc.so.6
>>
>> Thanks and Regards,
>> Kotresh H R
>>
>> ----- Original Message -----
>>> From: "Vijay Bellur" <vbellur@xxxxxxxxxx>
>>> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>, "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
>>> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
>>> Sent: Saturday, May 9, 2015 12:52:33 PM
>>> Subject: Re: regression: brick crashed because of changelog xlator init failure
>>>
>>> On 05/09/2015 12:49 PM, Kotresh Hiremath Ravishankar wrote:
>>>> If you observe the logs below, socket binding failed with 'Address
>>>> already in use' and 'Port already in use' errors.
>>>> Because of that, changelog failed to initiate its rpc server, and hence
>>>> its init failed.
>>>> Not sure why socket binding failed on this machine.
>>>>
>>>> [2015-05-08 21:34:47.747059] E [socket.c:823:__socket_server_bind]
>>>> 0-socket.patchy-changelog: binding to failed: Address already in use
>>>> [2015-05-08 21:34:47.747078] E [socket.c:826:__socket_server_bind]
>>>> 0-socket.patchy-changelog: Port is already in use
>>>> [2015-05-08 21:34:47.747096] W [rpcsvc.c:1602:rpcsvc_transport_create]
>>>> 0-rpc-service: listening on transport failed
>>>> [2015-05-08 21:34:47.747197] I [mem-pool.c:587:mem_pool_destroy]
>>>> 0-patchy-changelog: size=116 max=0 total=0
>>>> [2015-05-08 21:34:47.750460] E [xlator.c:426:xlator_init]
>>>> 0-patchy-changelog: Initialization of volume 'patchy-changelog' failed,
>>>> review your volfile again
>>>> [2015-05-08 21:34:47.750485] E [graph.c:322:glusterfs_graph_init]
>>>> 0-patchy-changelog: initializing translator failed
>>>> [2015-05-08 21:34:47.750497] E [graph.c:661:glusterfs_graph_activate]
>>>> 0-graph: init failed
>>>> [2015-05-08 21:34:47.749020] I
>>>> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread
>>>> with index 2
>>>
>>> Irrespective of a socket bind failing, we should not crash. Any ideas
>>> why glusterfsd crashed?
>>>
>>> -Vijay
>>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel@xxxxxxxxxxx
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>

--
~Atin
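[Editor's note] The 'Address already in use' failure described above is the kernel refusing a second bind() to a port that already has a bound socket, which is exactly what happens when several virtual peers on one node race for the same fixed port. A minimal standalone sketch of the failure mode (the helper name `second_bind_errno` is hypothetical, not GlusterFS code):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind one listener to an ephemeral loopback port, then try to bind a
 * second socket to the same port. Returns the errno from the second
 * bind() -- the same EADDRINUSE that __socket_server_bind logs above. */
static int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int s1, s2, err;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    s1 = socket(AF_INET, SOCK_STREAM, 0);
    s2 = socket(AF_INET, SOCK_STREAM, 0);
    assert(s1 >= 0 && s2 >= 0);

    assert(bind(s1, (struct sockaddr *)&addr, sizeof(addr)) == 0);
    /* discover which port the kernel assigned to s1 */
    assert(getsockname(s1, (struct sockaddr *)&addr, &len) == 0);

    /* second bind to the now-occupied port must fail */
    err = (bind(s2, (struct sockaddr *)&addr, sizeof(addr)) == 0) ? 0 : errno;

    close(s1);
    close(s2);
    return err;
}
```

Note that SO_REUSEADDR would only help with sockets lingering in TIME_WAIT; it does not allow two live listeners on the same port, so tests that stand up multiple peers on one node need to avoid fixed-port collisions in the first place.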
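[Editor's note] On Vijay's question of why glusterfsd crashed at all: the first backtrace shows mem_put()/__gf_free() spinning on a lock inside a pool, while the log excerpt shows mem_pool_destroy already ran for patchy-changelog, which suggests an object being returned to a pool that was torn down during the failed init. One common fix pattern for that ordering bug, sketched with hypothetical names and without the locking a real implementation needs:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal sketch (not the GlusterFS implementation): a pool that defers
 * its own teardown until every outstanding object has been returned, so
 * a late mem_put-style call never touches freed pool memory. */
struct pool {
    int outstanding; /* objects handed out via pool_get() */
    int dying;       /* pool_destroy() was requested */
};

struct obj {
    struct pool *owner;
    char payload[64];
};

static struct obj *pool_get(struct pool *p)
{
    struct obj *o = calloc(1, sizeof(*o));
    assert(o != NULL);
    o->owner = p;
    p->outstanding++;
    return o;
}

static void pool_free_if_idle(struct pool *p)
{
    /* free the pool only once destruction was requested AND nothing is out */
    if (p->dying && p->outstanding == 0)
        free(p);
}

static void pool_put(struct obj *o)
{
    struct pool *p = o->owner;
    free(o);
    p->outstanding--;
    pool_free_if_idle(p); /* the last returner tears the pool down */
}

static void pool_destroy(struct pool *p)
{
    p->dying = 1;
    pool_free_if_idle(p); /* frees immediately only if nothing is outstanding */
}
```

With this shape, the failed xlator's pool_destroy() during init cleanup would merely mark the pool dying, and the log buffers still queued for flushing in cleanup_and_exit could be returned safely afterwards.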