Could you please raise a bug for this? I saw this some time back as well, and will work on it with priority if you do.

Pranith

----- Original Message -----
> From: "Song" <gluster@xxxxxxx>
> To: "Song" <gluster@xxxxxxx>, "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> Cc: "John Mark Walker" <johnmark@xxxxxxxxxxx>, gluster-users@xxxxxxxxxxx
> Sent: Monday, December 2, 2013 2:49:26 PM
> Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Pranith,
>
> Another kind of client crash happened; the gdb information is below for your
> reference:
>
> Core was generated by `/usr/sbin/glusterfs --log-level=INFO --volfile-id=gfs6
> --volfile-server=bj-nx-c'.
> Program terminated with signal 11, Segmentation fault.
> #0  afr_frame_return (frame=<value optimized out>) at afr-common.c:983
> 983             call_count = --local->call_count;
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.47.el6.x86_64 libgcc-4.4.6-3.el6.x86_64
> openssl-1.0.0-20.el6.x86_64 zlib-1.2.3-27.el6.x86_64
> (gdb) where
> #0  afr_frame_return (frame=<value optimized out>) at afr-common.c:983
> #1  0x00007f8aa1c1ebbc in afr_sh_entry_impunge_parent_setattr_cbk
>     (setattr_frame=0x7f8aa525b248, cookie=<value optimized out>, this=0x1a82e00,
>     op_ret=<value optimized out>, op_errno=<value optimized out>,
>     preop=<value optimized out>, postop=0x0, xdata=0x0) at afr-self-heal-entry.c:970
> #2  0x00007f8aa1e5fecb in client3_1_setattr (frame=0x7f8aa54ec634,
>     this=<value optimized out>, data=<value optimized out>) at client3_1-fops.c:5801
> #3  0x00007f8aa1e58b41 in client_setattr (frame=0x7f8aa54ec634, this=<value
>     optimized out>, loc=<value optimized out>, stbuf=<value optimized out>,
>     valid=<value optimized out>, xdata=<value optimized out>) at client.c:1915
> #4  0x00007f8aa1c1f080 in afr_sh_entry_impunge_setattr
>     (impunge_frame=0x7f8aa5454e10, this=<value optimized out>) at afr-self-heal-entry.c:1017
> #5  0x00007f8aa1c1f5c0 in afr_sh_entry_impunge_xattrop_cbk
>     (impunge_frame=0x7f8aa5454e10, cookie=0x1, this=0x1a82e00, op_ret=<value
>     optimized out>, op_errno=22, xattr=<value optimized out>, xdata=0x0)
>     at afr-self-heal-entry.c:1067
> #6  0x00007f8aa1e6b34e in client3_1_xattrop_cbk (req=<value optimized out>,
>     iov=<value optimized out>, count=<value optimized out>,
>     myframe=0x7f8aa54ad5b8) at client3_1-fops.c:1715
> #7  0x00000037eba0f4e5 in rpc_clnt_handle_reply (clnt=0x1eaccd0,
>     pollin=0x2fba390) at rpc-clnt.c:786
> #8  0x00000037eba0fce0 in rpc_clnt_notify (trans=<value optimized out>,
>     mydata=0x1eacd00, event=<value optimized out>, data=<value optimized out>)
>     at rpc-clnt.c:905
> #9  0x00000037eba0aeb8 in rpc_transport_notify (this=<value optimized out>,
>     event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
> #10 0x00007f8aa2cb5764 in socket_event_poll_in (this=0x1ebc730) at socket.c:1677
> #11 0x00007f8aa2cb5847 in socket_event_handler (fd=<value optimized out>,
>     idx=127, data=0x1ebc730, poll_in=1, poll_out=0, poll_err=<value optimized out>)
>     at socket.c:1792
> #12 0x00000037eb63e464 in event_dispatch_epoll_handler (event_pool=0x19eddf0)
>     at event.c:785
> #13 event_dispatch_epoll (event_pool=0x19eddf0) at event.c:847
> #14 0x000000000040736a in main (argc=<value optimized out>,
>     argv=0x7fff26cdcd78) at glusterfsd.c:1689
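The segfault in frame #0 fires on "call_count = --local->call_count;", which indicates that frame->local was stale (already freed, or never attached on this error-unwind path; note postop=0x0 in frame #1) by the time the setattr callback returned through afr_frame_return. Below is a minimal, self-contained C sketch of that failure pattern; the struct names are simplified stand-ins, not the actual GlusterFS types.

#include <stdlib.h>

/* Simplified, hypothetical stand-ins for GlusterFS's call_frame_t and
 * afr_local_t; the real types live in libglusterfs and the afr xlator. */
struct local { int call_count; };
struct frame { struct local *local; };

/* Mirrors the shape of afr_frame_return(): decrement the count of
 * outstanding calls kept in frame->local. If local has already been
 * released (or was never attached on this unwind path), the decrement
 * dereferences invalid memory -- the SIGSEGV at afr-common.c:983. */
static int frame_return(struct frame *frame)
{
    struct local *local = frame->local;
    return --local->call_count;   /* crashes when local is stale */
}

int main(void)
{
    struct frame f;
    f.local = calloc(1, sizeof(*f.local));
    f.local->call_count = 2;

    frame_return(&f);   /* fine: 2 -> 1 */

    free(f.local);      /* a cleanup path releases local early... */
    f.local = NULL;     /* ...and nothing guards later callbacks */

    frame_return(&f);   /* NULL dereference: segfault, as in frame #0 */
    return 0;
}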
> -----Original Message-----
> From: Song [mailto:gluster@xxxxxxx]
> Sent: Monday, October 28, 2013 11:25 AM
> To: 'Pranith Kumar Karampuri'
> Cc: 'John Mark Walker'; 'gluster-users@xxxxxxxxxxx'
> Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Pranith,
>
> Another similar client crash happened. Following are the glusterfs log and
> gdb output for your reference.
>
> pending frames:
> frame : type(1) op(STATFS)
> frame : type(1) op(STATFS)
>
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 6
> time of crash: 2013-10-28 00:41:53
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.3.1
> /lib64/libc.so.6[0x3a0c432900]
> /lib64/libc.so.6(gsignal+0x35)[0x3a0c432885]
> /lib64/libc.so.6(abort+0x175)[0x3a0c434065]
> /lib64/libc.so.6[0x3a0c46f7a7]
> /lib64/libc.so.6[0x3a0c4750c6]
> /usr/lib/libglusterfs.so.0(gf_timer_call_cancel+0xb0)[0x328b42a180]
> /usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client_ping_cbk+0x6d)[0x7f3514afe54d]
> /usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x328b80f4e5]
> /usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x328b80fce0]
> /usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x328b80aeb8]
> /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f351593a764]
> /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f351593a847]
> /usr/lib/libglusterfs.so.0[0x328b43e464]
> /usr/sbin/glusterfs(main+0x58a)[0x40736a]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3a0c41ecdd]
> /usr/sbin/glusterfs[0x4042d9]
> ---------
>
> (gdb) where
> #0  0x0000003a0c432885 in raise () from /lib64/libc.so.6
> #1  0x0000003a0c434065 in abort () from /lib64/libc.so.6
> #2  0x0000003a0c46f7a7 in __libc_message () from /lib64/libc.so.6
> #3  0x0000003a0c4750c6 in malloc_printerr () from /lib64/libc.so.6
> #4  0x000000328b42a180 in gf_timer_call_cancel (ctx=<value optimized out>,
>     event=0x7f34f0001730) at timer.c:122
> #5  0x00007f3514afe54d in client_ping_cbk (req=<value optimized out>,
>     iov=<value optimized out>, count=<value optimized out>,
>     myframe=0x7f3517f0751c) at client-handshake.c:285
> #6  0x000000328b80f4e5 in rpc_clnt_handle_reply (clnt=0x1890aa0,
>     pollin=0x1e7acb0) at rpc-clnt.c:786
> #7  0x000000328b80fce0 in rpc_clnt_notify (trans=<value optimized out>,
>     mydata=0x1890ad0, event=<value optimized out>, data=<value optimized out>)
>     at rpc-clnt.c:905
> #8  0x000000328b80aeb8 in rpc_transport_notify (this=<value optimized out>,
>     event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
> #9  0x00007f351593a764 in socket_event_poll_in (this=0x18a0500) at socket.c:1677
> #10 0x00007f351593a847 in socket_event_handler (fd=<value optimized out>,
>     idx=41, data=0x18a0500, poll_in=1, poll_out=0, poll_err=<value optimized out>)
>     at socket.c:1792
> #11 0x000000328b43e464 in event_dispatch_epoll_handler (event_pool=0x930df0)
>     at event.c:785
> #12 event_dispatch_epoll (event_pool=0x930df0) at event.c:847
> #13 0x000000000040736a in main (argc=<value optimized out>,
>     argv=0x7fff829eac78) at glusterfsd.c:1689
>
> -----Original Message-----
> From: Pranith Kumar Karampuri [mailto:pkarampu@xxxxxxxxxx]
> Sent: Friday, October 25, 2013 2:00 PM
> To: Song
> Cc: John Mark Walker; gluster-users@xxxxxxxxxxx
> Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Thanks for this information. Let us see if we can re-create the issue in our
> environment. If that does not help, we shall do a detailed analysis of the
> code to figure this out.
>
> Pranith
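A note on the gf_timer_call_cancel trace above: an abort inside malloc_printerr means glibc's heap checker detected corruption while the ping-timer event was being cancelled, which is consistent with the same timer event being freed twice. A minimal sketch of that double-cancel hypothesis, with simplified types rather than the real libglusterfs timer code:

#include <stdlib.h>

/* Simplified, hypothetical stand-in for the timer event handled by
 * gf_timer_call_cancel(); the real gf_timer_t is in libglusterfs. */
struct timer_event {
    void (*callback)(void *data);
    void *data;
};

/* Sketch of the suspected pattern: cancelling (and thereby freeing)
 * the same ping-timer event twice. glibc's heap checker catches the
 * second free and calls abort() through malloc_printerr -- the
 * signal 6 seen in the trace above. */
static void timer_cancel(struct timer_event *event)
{
    /* the real code also unlinks the event from the timer list */
    free(event);
}

int main(void)
{
    struct timer_event *ping_timer = calloc(1, sizeof(*ping_timer));

    timer_cancel(ping_timer);   /* first cancel: legitimate */
    timer_cancel(ping_timer);   /* second cancel of a stale pointer:
                                   "double free or corruption", abort() */
    return 0;
}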
> ----- Original Message -----
> > From: "Song" <gluster@xxxxxxx>
> > To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> > Cc: "John Mark Walker" <johnmark@xxxxxxxxxxx>, gluster-users@xxxxxxxxxxx
> > Sent: Wednesday, October 23, 2013 2:53:03 PM
> > Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> >
> > Pranith,
> >
> > Thanks for your detailed answer.
> >
> > Our workload includes CREATE/WRITE/READ/STAT/ACCESS, as well as
> > chmod(filepath, 0), but I don't know which kind of workload led to
> > the crash.
> > We have analyzed the related code, such as dict, lookup of cluster/afr,
> > and lookup of protocol/client, and found no information useful for
> > locating the issue.
> >
> > Song.
> >
> > -----Original Message-----
> > From: Pranith Kumar Karampuri [mailto:pkarampu@xxxxxxxxxx]
> > Sent: Tuesday, October 22, 2013 5:25 PM
> > To: Song
> > Cc: John Mark Walker; gluster-users@xxxxxxxxxxx
> > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> >
> > Song,
> > The information printed in that function gf_print_trace has been useful
> > in the sense that we know it happens when there is a double 'memput' of
> > one of the data structures as part of 'lookup'. The problem is that this
> > issue seems to happen only in some peculiar case, which unfortunately
> > you are hitting every day on 1-2 clients. That is why I was trying to
> > figure out what the workload is.
> >
> > Let me explain what I mean by 'workload'.
> > For example:
> > Websites that do some kind of image manipulation generally CREATE
> > temporary files, do some transformations (i.e. READs/WRITEs), and then
> > RENAME them to the actual files.
> > So here the workload is CREATE/READ/WRITE/RENAME intensive.
> >
> > To give you one more example:
> > VM image hosting (at least with the KVM images that I generally test)
> > pretty much does WRITEs, READs, and STATs on each VM image, so it is
> > WRITE/STAT/READ intensive.
> >
> > I would really like to know what kind of workload happens on your
> > setup, to figure out what the peculiar thing is that may lead to this
> > crash.
> >
> > Pranith.
> >
> > ----- Original Message -----
> > > From: "Song" <gluster@xxxxxxx>
> > > To: "Song" <gluster@xxxxxxx>, "John Mark Walker" <johnmark@xxxxxxxxxxx>,
> > > "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> > > Cc: gluster-users@xxxxxxxxxxx
> > > Sent: Tuesday, October 22, 2013 1:56:48 PM
> > > Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> > >
> > > To locate this issue, is it possible to print more useful
> > > information in the backtrace?
> > > When the client crashed, trace information was printed; this is coded
> > > in the function gf_print_trace, in common-utils.c.
> > > I hope some helpful debug information can be appended in this
> > > function, so that the next time a client crashes, the data can help
> > > us analyze the problem.
> > >
> > > Could you suggest what code would be useful?
> > > Thanks!
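One low-risk way to get more context out of a crash handler in the style of gf_print_trace is to record small state tags (for example, the type of the last fop issued) in sig_atomic_t variables on the normal path, and print them from the handler using only async-signal-safe calls. The sketch below is purely illustrative; the variable and handler names are hypothetical, not the actual common-utils.c code.

#include <signal.h>
#include <unistd.h>

/* Inside a signal handler, only async-signal-safe functions (write,
 * _exit, ...) may be used; no printf, no malloc. */
#define WRITE_LIT(s) (void) write (STDERR_FILENO, s, sizeof (s) - 1)

/* hypothetical tag, updated by the fop path before winding a call */
static volatile sig_atomic_t last_fop_type = 0;

static void crash_handler(int sig)
{
    WRITE_LIT("signal received, last fop type: ");
    /* render the small integer without sprintf (not signal-safe) */
    char digit = (char)('0' + (last_fop_type % 10));
    (void) write (STDERR_FILENO, &digit, 1);
    WRITE_LIT("\n");
    _exit(128 + sig);
}

int main(void)
{
    signal(SIGSEGV, crash_handler);
    last_fop_type = 1;   /* e.g. set by the LOOKUP path before winding */
    raise(SIGSEGV);      /* simulate the crash */
    return 0;
}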
> > > -----Original Message-----
> > > From: gluster-users-bounces@xxxxxxxxxxx
> > > [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Song
> > > Sent: Friday, September 06, 2013 10:17 AM
> > > To: 'John Mark Walker'; 'Pranith Kumar Karampuri'
> > > Cc: gluster-users@xxxxxxxxxxx
> > > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> > >
> > > It's a pity that I don't know how to re-create the issue, yet there are
> > > 1-2 crashed clients out of 120 clients in total every day.
> > >
> > > Below is the gdb result:
> > >
> > > (gdb) where
> > > #0  0x0000003267432885 in raise () from /lib64/libc.so.6
> > > #1  0x0000003267434065 in abort () from /lib64/libc.so.6
> > > #2  0x000000326746f7a7 in __libc_message () from /lib64/libc.so.6
> > > #3  0x00000032674750c6 in malloc_printerr () from /lib64/libc.so.6
> > > #4  0x00007fc4f2847684 in mem_put (ptr=0x7fc4b0a4c03c) at mem-pool.c:559
> > > #5  0x00007fc4f281cc9b in dict_destroy (this=0x7fc4f12cc5cc) at dict.c:397
> > > #6  0x00007fc4ede24c30 in afr_local_cleanup (local=0x7fc4ce68ac20,
> > >     this=<value optimized out>) at afr-common.c:848
> > > #7  0x00007fc4ede2c0f1 in afr_lookup_done (frame=0x18d5ae4, cookie=0x0,
> > >     this=<value optimized out>, op_ret=<value optimized out>,
> > >     op_errno=<value optimized out>, inode=0x18d5b20, buf=0x7fffcb83ec50,
> > >     xattr=0x7fc4f12e1818, postparent=0x7fffcb83ebe0) at afr-common.c:1881
> > > #8  afr_lookup_cbk (frame=0x18d5ae4, cookie=0x0, this=<value optimized out>,
> > >     op_ret=<value optimized out>, op_errno=<value optimized out>,
> > >     inode=0x18d5b20, buf=0x7fffcb83ec50, xattr=0x7fc4f12e1818,
> > >     postparent=0x7fffcb83ebe0) at afr-common.c:2044
> > > #9  0x00007fc4ee066550 in client3_1_lookup_cbk (req=<value optimized out>,
> > >     iov=<value optimized out>, count=<value optimized out>,
> > >     myframe=0x7fc4f16f390c) at client3_1-fops.c:2636
> > > #10 0x00007fc4f25ff4e5 in rpc_clnt_handle_reply (clnt=0x3b5c600,
> > >     pollin=0x6ba00f0) at rpc-clnt.c:786
> > > #11 0x00007fc4f25ffce0 in rpc_clnt_notify (trans=<value optimized out>,
> > >     mydata=0x3b5c630, event=<value optimized out>, data=<value optimized out>)
> > >     at rpc-clnt.c:905
> > > #12 0x00007fc4f25faeb8 in rpc_transport_notify (this=<value optimized out>,
> > >     event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
> > > #13 0x00007fc4eeeb0764 in socket_event_poll_in (this=0x3b6c060) at socket.c:1677
> > > #14 0x00007fc4eeeb0847 in socket_event_handler (fd=<value optimized out>,
> > >     idx=265, data=0x3b6c060, poll_in=1, poll_out=0, poll_err=<value optimized out>)
> > >     at socket.c:1792
> > > #15 0x00007fc4f2846464 in event_dispatch_epoll_handler (event_pool=0x177cdf0)
> > >     at event.c:785
> > > #16 event_dispatch_epoll (event_pool=0x177cdf0) at event.c:847
> > > #17 0x000000000040736a in main (argc=<value optimized out>,
> > >     argv=0x7fffcb83efc8) at glusterfsd.c:1689
> > >
> > > -----Original Message-----
> > > From: jowalker@xxxxxxxxxx [mailto:jowalker@xxxxxxxxxx] On Behalf Of
> > > John Mark Walker
> > > Sent: Thursday, September 05, 2013 1:06 PM
> > > To: Pranith Kumar Karampuri
> > > Cc: Song; gluster-devel@xxxxxxxxxx
> > > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> > >
> > > Posting to gluster-users.
> > >
> > > ----- Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
> > > > Song,
> > > >     Seems like the issue is happening because of a double 'memput'.
> > > > Could you let us know the steps to re-create the issue, or the load
> > > > that may lead to this?
> > > >
> > > > Pranith
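For readers following along: a double 'memput' typically corrupts the pool's bookkeeping silently, so the process only aborts later inside glibc, as in the traces above. Below is a minimal sketch, under the assumption that an in-use flag is added to the object header, of how the second put can be turned into an immediate, diagnosable failure; this is a simplified model, not the real mem-pool.c.

#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical model of a mem-pool object header; the real
 * mem_get()/mem_put() live in libglusterfs/src/mem-pool.c. */
struct pool_obj {
    int in_use;         /* 1 while handed out, 0 once returned */
    char payload[64];   /* caller-visible memory */
};

static struct pool_obj *pool_get(void)
{
    struct pool_obj *obj = calloc(1, sizeof(*obj));
    obj->in_use = 1;
    return obj;
}

static void pool_put(struct pool_obj *obj)
{
    /* double-put detection: fail loudly at the second put instead of
     * corrupting the allocator's state */
    assert(obj->in_use && "double mem_put detected");
    obj->in_use = 0;    /* returned to the pool's free list, not free()d */
}

int main(void)
{
    struct pool_obj *local = pool_get();
    pool_put(local);    /* first put: fine */
    pool_put(local);    /* second put: the assertion fires here */
    return 0;
}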
> > > > ----- Original Message -----
> > > > > From: "Song" <gluster@xxxxxxx>
> > > > > To: gluster-devel@xxxxxxxxxx
> > > > > Sent: Thursday, September 5, 2013 8:05:57 AM
> > > > > Subject: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> > > > >
> > > > > I installed GlusterFS 3.3.1 on my 24 servers, created a DHT+AFR
> > > > > volume, and mounted it with the native client.
> > > > >
> > > > > Recently, some glusterfs clients have crashed; the log is below.
> > > > >
> > > > > The OS is 64-bit CentOS 6.2, kernel version:
> > > > > 2.6.32-220.23.1.el6.x86_64 #1 SMP Fri Jun 28 00:56:49 CST 2013
> > > > > x86_64 x86_64 x86_64 GNU/Linux
> > > > >
> > > > > pending frames:
> > > > > frame : type(1) op(LOOKUP)
> > > > > frame : type(1) op(LOOKUP)
> > > > > frame : type(1) op(LOOKUP)
> > > > >
> > > > > patchset: git://git.gluster.com/glusterfs.git
> > > > > signal received: 6
> > > > > time of crash: 2013-09-05 00:37:40
> > > > > configuration details:
> > > > > argp 1
> > > > > backtrace 1
> > > > > dlfcn 1
> > > > > fdatasync 1
> > > > > libpthread 1
> > > > > llistxattr 1
> > > > > setfsid 1
> > > > > spinlock 1
> > > > > epoll.h 1
> > > > > xattr.h 1
> > > > > st_atim.tv_nsec 1
> > > > > package-string: glusterfs 3.3.1
> > > > > /lib64/libc.so.6[0x3ac0232900]
> > > > > /lib64/libc.so.6(gsignal+0x35)[0x3ac0232885]
> > > > > /lib64/libc.so.6(abort+0x175)[0x3ac0234065]
> > > > > /lib64/libc.so.6[0x3ac026f7a7]
> > > > > /lib64/libc.so.6[0x3ac02750c6]
> > > > > /usr/lib/libglusterfs.so.0(mem_put+0x64)[0x7f3f99c2c684]
> > > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_local_cleanup+0x60)[0x7f3f95209c30]
> > > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_lookup_cbk+0x5a1)[0x7f3f952110f1]
> > > > > /usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client3_1_lookup_cbk+0x6b0)[0x7f3f9544b550]
> > > > > /usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f3f999e44e5]
> > > > > /usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f3f999e4ce0]
> > > > > /usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f3f999dfeb8]
> > > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f3f96295764]
> > > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f3f96295847]
> > > > > /usr/lib/libglusterfs.so.0(+0x3e464)[0x7f3f99c2b464]
> > > > > /usr/sbin/glusterfs(main+0x58a)[0x40736a]
> > > > > /lib64/libc.so.6(__libc_start_main+0xfd)[0x3ac021ecdd]
> > > > > /usr/sbin/glusterfs[0x4042d9]
> > > > > ---------
> > > > >
> > > > > Best regards.
> > > > > Willard Song
> > > > >
> > > > > _______________________________________________
> > > > > Gluster-devel mailing list
> > > > > Gluster-devel@xxxxxxxxxx
> > > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel@xxxxxxxxxx
> > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > >
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users@xxxxxxxxxxx
> > > http://supercolony.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users