Pranith,

Another similar client crash happened. Following are the glusterfs log and gdb output for your reference.

pending frames:
frame : type(1) op(STATFS)
frame : type(1) op(STATFS)

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2013-10-28 00:41:53
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.1
/lib64/libc.so.6[0x3a0c432900]
/lib64/libc.so.6(gsignal+0x35)[0x3a0c432885]
/lib64/libc.so.6(abort+0x175)[0x3a0c434065]
/lib64/libc.so.6[0x3a0c46f7a7]
/lib64/libc.so.6[0x3a0c4750c6]
/usr/lib/libglusterfs.so.0(gf_timer_call_cancel+0xb0)[0x328b42a180]
/usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client_ping_cbk+0x6d)[0x7f3514afe54d]
/usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x328b80f4e5]
/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x328b80fce0]
/usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x328b80aeb8]
/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f351593a764]
/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f351593a847]
/usr/lib/libglusterfs.so.0[0x328b43e464]
/usr/sbin/glusterfs(main+0x58a)[0x40736a]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3a0c41ecdd]
/usr/sbin/glusterfs[0x4042d9]
---------

(gdb) where
#0  0x0000003a0c432885 in raise () from /lib64/libc.so.6
#1  0x0000003a0c434065 in abort () from /lib64/libc.so.6
#2  0x0000003a0c46f7a7 in __libc_message () from /lib64/libc.so.6
#3  0x0000003a0c4750c6 in malloc_printerr () from /lib64/libc.so.6
#4  0x000000328b42a180 in gf_timer_call_cancel (ctx=<value optimized out>, event=0x7f34f0001730) at timer.c:122
#5  0x00007f3514afe54d in client_ping_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f3517f0751c) at client-handshake.c:285
#6  0x000000328b80f4e5 in rpc_clnt_handle_reply (clnt=0x1890aa0, pollin=0x1e7acb0) at rpc-clnt.c:786
#7  0x000000328b80fce0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x1890ad0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
#8  0x000000328b80aeb8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
#9  0x00007f351593a764 in socket_event_poll_in (this=0x18a0500) at socket.c:1677
#10 0x00007f351593a847 in socket_event_handler (fd=<value optimized out>, idx=41, data=0x18a0500, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
#11 0x000000328b43e464 in event_dispatch_epoll_handler (event_pool=0x930df0) at event.c:785
#12 event_dispatch_epoll (event_pool=0x930df0) at event.c:847
#13 0x000000000040736a in main (argc=<value optimized out>, argv=0x7fff829eac78) at glusterfsd.c:1689

-----Original Message-----
From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
Sent: Friday, October 25, 2013 2:00 PM
To: Song
Cc: John Mark Walker; gluster-users at gluster.org
Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)

Thanks for this information. Let us see if we can re-create the issue in our environment. If that does not help, we shall do a detailed analysis of the code to figure this out.

Pranith

----- Original Message -----
> From: "Song" <gluster at 163.com>
> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> Cc: "John Mark Walker" <johnmark at gluster.org>, gluster-users at gluster.org
> Sent: Wednesday, October 23, 2013 2:53:03 PM
> Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Pranith,
>
> Thanks for your detailed answer.
>
> Our workload includes CREATE/WRITE/READ/STAT/ACCESS, as well as
> chmod(filepath, 0). However, I don't know which kind of workload leads to
> the crash.
> We have analyzed the related code, such as dict, the lookup path of cluster/afr,
> and the lookup path of protocol/client, but found no useful information to help
> locate the issue.
>
> Song.
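
One way to approach the workload question is to tally the operations listed under "pending frames" across many crash logs. The sketch below does that with a regular expression. It assumes the log lines look exactly like the "frame : type(1) op(...)" lines quoted in this thread; the function name tally_pending_ops and the SAMPLE text are illustrative only and are not part of GlusterFS.

```python
import re
from collections import Counter

# Matches lines such as "frame : type(1) op(STATFS)" as they appear in the
# crash dumps quoted in this thread (assumed format, not an official spec).
FRAME_RE = re.compile(r"frame\s*:\s*type\((\d+)\)\s*op\((\w+)\)")


def tally_pending_ops(log_text):
    """Count how often each op name appears in 'pending frames' entries."""
    return Counter(op for _frame_type, op in FRAME_RE.findall(log_text))


# Sample text copied from the two crash dumps in this thread.
SAMPLE = """\
pending frames:
frame : type(1) op(STATFS)
frame : type(1) op(STATFS)
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
"""

if __name__ == "__main__":
    # Print the ops seen at crash time, most frequent first.
    for op, count in tally_pending_ops(SAMPLE).most_common():
        print(op, count)
```

Running the same function over the logs of every crashed client would show whether the crashes cluster around particular operations. The pending frames only record what was in flight at crash time, so this is a hint about the workload rather than proof of the cause.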
>
> -----Original Message-----
> From: Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
> Sent: Tuesday, October 22, 2013 5:25 PM
> To: Song
> Cc: John Mark Walker; gluster-users at gluster.org
> Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
>
> Song,
> The information printed in that function gf_print_trace has been useful
> in the sense that we know it happens when there is a double 'memput' of
> one of the data structures as part of 'lookup'. The problem is that this
> issue seems to be happening only in some peculiar case, which
> unfortunately you are hitting every day on 1-2 clients. That is why I
> was trying to figure out what the workload is.
>
> Let me tell you what I mean by 'workload'.
> For example:
> Websites that do some kind of image manipulation generally
> CREATE temporary information, do some transformations, i.e.
> READs/WRITEs, and then RENAME them to the actual files.
> So here the workload is CREATE/READ/WRITE/RENAME intensive.
>
> To give you one more example:
> VM image hosting (at least with the KVM images that I generally test):
> each VM image pretty much does WRITEs, READs, and STATs, so it is
> WRITE/STAT/READ intensive.
>
> I would really like to know what kind of workload happens on your
> setup, to figure out what peculiar thing may lead to this crash.
>
> Pranith.
>
> ----- Original Message -----
> > From: "Song" <gluster at 163.com>
> > To: "Song" <gluster at 163.com>, "John Mark Walker" <johnmark at gluster.org>, "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > Cc: gluster-users at gluster.org
> > Sent: Tuesday, October 22, 2013 1:56:48 PM
> > Subject: RE: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> >
> > To locate this issue, is it possible to print more useful
> > information in the backtrace?
> > When the client crashed, trace information was printed. This is coded
> > in the function "gf_print_trace", in common-utils.c.
> > I hope that some helpful debug information can be added to this
> > function so that, when a client crashes next time, the data can help us
> > analyze the problem.
> >
> > Could you suggest what code would be useful?
> > Thanks!
> >
> > -----Original Message-----
> > From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Song
> > Sent: Friday, September 06, 2013 10:17 AM
> > To: 'John Mark Walker'; 'Pranith Kumar Karampuri'
> > Cc: gluster-users at gluster.org
> > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> >
> > It's a pity I don't know how to re-create the issue. However, 1-2 of
> > our 120 clients crash every day.
> >
> > Below is the gdb result:
> >
> > (gdb) where
> > #0  0x0000003267432885 in raise () from /lib64/libc.so.6
> > #1  0x0000003267434065 in abort () from /lib64/libc.so.6
> > #2  0x000000326746f7a7 in __libc_message () from /lib64/libc.so.6
> > #3  0x00000032674750c6 in malloc_printerr () from /lib64/libc.so.6
> > #4  0x00007fc4f2847684 in mem_put (ptr=0x7fc4b0a4c03c) at mem-pool.c:559
> > #5  0x00007fc4f281cc9b in dict_destroy (this=0x7fc4f12cc5cc) at dict.c:397
> > #6  0x00007fc4ede24c30 in afr_local_cleanup (local=0x7fc4ce68ac20, this=<value optimized out>) at afr-common.c:848
> > #7  0x00007fc4ede2c0f1 in afr_lookup_done (frame=0x18d5ae4, cookie=0x0, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, inode=0x18d5b20, buf=0x7fffcb83ec50, xattr=0x7fc4f12e1818, postparent=0x7fffcb83ebe0) at afr-common.c:1881
> > #8  afr_lookup_cbk (frame=0x18d5ae4, cookie=0x0, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, inode=0x18d5b20, buf=0x7fffcb83ec50, xattr=0x7fc4f12e1818, postparent=0x7fffcb83ebe0) at afr-common.c:2044
> > #9  0x00007fc4ee066550 in client3_1_lookup_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7fc4f16f390c) at client3_1-fops.c:2636
> > #10 0x00007fc4f25ff4e5 in rpc_clnt_handle_reply (clnt=0x3b5c600, pollin=0x6ba00f0) at rpc-clnt.c:786
> > #11 0x00007fc4f25ffce0 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x3b5c630, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
> > #12 0x00007fc4f25faeb8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
> > #13 0x00007fc4eeeb0764 in socket_event_poll_in (this=0x3b6c060) at socket.c:1677
> > #14 0x00007fc4eeeb0847 in socket_event_handler (fd=<value optimized out>, idx=265, data=0x3b6c060, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
> > #15 0x00007fc4f2846464 in event_dispatch_epoll_handler (event_pool=0x177cdf0) at event.c:785
> > #16 event_dispatch_epoll (event_pool=0x177cdf0) at event.c:847
> > #17 0x000000000040736a in main (argc=<value optimized out>, argv=0x7fffcb83efc8) at glusterfsd.c:1689
> >
> >
> > -----Original Message-----
> > From: jowalker at redhat.com [mailto:jowalker at redhat.com] On Behalf Of John Mark Walker
> > Sent: Thursday, September 05, 2013 1:06 PM
> > To: Pranith Kumar Karampuri
> > Cc: Song; gluster-devel at nongnu.org
> > Subject: Re: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> >
> > Posting to gluster-users.
> >
> >
> > ----- Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
> > > Song,
> > > Seems like the issue is happening because of a double 'memput'.
> > > Could you let us know the steps to re-create the issue? Or the load
> > > that may lead to this?
> > >
> > > Pranith
> > >
> > > ----- Original Message -----
> > > > From: "Song" <gluster at 163.com>
> > > > To: gluster-devel at nongnu.org
> > > > Sent: Thursday, September 5, 2013 8:05:57 AM
> > > > Subject: [Gluster-devel] GlusterFS 3.3.1 client crash (signal received: 6)
> > > >
> > > > I installed GlusterFS 3.3.1 on my 24 servers, created a DHT+AFR
> > > > volume, and mounted it with the native client.
> > > >
> > > > Recently, some glusterfs clients have crashed; the log is below.
> > > >
> > > > The OS is 64-bit CentOS 6.2, kernel version:
> > > > 2.6.32-220.23.1.el6.x86_64 #1 SMP Fri Jun 28 00:56:49 CST 2013
> > > > x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > > pending frames:
> > > > frame : type(1) op(LOOKUP)
> > > > frame : type(1) op(LOOKUP)
> > > > frame : type(1) op(LOOKUP)
> > > >
> > > > patchset: git://git.gluster.com/glusterfs.git
> > > > signal received: 6
> > > > time of crash: 2013-09-05 00:37:40
> > > > configuration details:
> > > > argp 1
> > > > backtrace 1
> > > > dlfcn 1
> > > > fdatasync 1
> > > > libpthread 1
> > > > llistxattr 1
> > > > setfsid 1
> > > > spinlock 1
> > > > epoll.h 1
> > > > xattr.h 1
> > > > st_atim.tv_nsec 1
> > > > package-string: glusterfs 3.3.1
> > > > /lib64/libc.so.6[0x3ac0232900]
> > > > /lib64/libc.so.6(gsignal+0x35)[0x3ac0232885]
> > > > /lib64/libc.so.6(abort+0x175)[0x3ac0234065]
> > > > /lib64/libc.so.6[0x3ac026f7a7]
> > > > /lib64/libc.so.6[0x3ac02750c6]
> > > > /usr/lib/libglusterfs.so.0(mem_put+0x64)[0x7f3f99c2c684]
> > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_local_cleanup+0x60)[0x7f3f95209c30]
> > > > /usr/lib/glusterfs/3.3.1/xlator/cluster/replicate.so(afr_lookup_cbk+0x5a1)[0x7f3f952110f1]
> > > > /usr/lib/glusterfs/3.3.1/xlator/protocol/client.so(client3_1_lookup_cbk+0x6b0)[0x7f3f9544b550]
> > > > /usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f3f999e44e5]
> > > > /usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f3f999e4ce0]
> > > > /usr/lib/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f3f999dfeb8]
> > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f3f96295764]
> > > > /usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f3f96295847]
> > > > /usr/lib/libglusterfs.so.0(+0x3e464)[0x7f3f99c2b464]
> > > > /usr/sbin/glusterfs(main+0x58a)[0x40736a]
> > > > /lib64/libc.so.6(__libc_start_main+0xfd)[0x3ac021ecdd]
> > > > /usr/sbin/glusterfs[0x4042d9]
> > > > ---------
> > > >
> > > > Best regards.
> > > >
> > > > Willard Song
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel at nongnu.org
> > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
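
The double 'memput' discussed in this thread can be illustrated with a toy object pool. This is only a sketch of the general idea. It is not GlusterFS's mem-pool.c, and the names ToyPool, DoublePutError, get, and put are hypothetical. The pool tracks which slots are handed out and refuses a second put of the same slot, which is the kind of misuse that glibc reports through malloc_printerr in the backtraces above.

```python
class DoublePutError(RuntimeError):
    """Raised when a slot is returned to the pool while it is not in use."""


class ToyPool:
    """Toy fixed-size pool of integer slots (illustrative only)."""

    def __init__(self, size):
        self._free = list(range(size))  # slots available for get()
        self._in_use = set()            # slots currently handed out

    def get(self):
        """Hand out one free slot, or raise MemoryError if none is left."""
        if not self._free:
            raise MemoryError("pool exhausted")
        slot = self._free.pop()
        self._in_use.add(slot)
        return slot

    def put(self, slot):
        """Return a slot to the pool, rejecting a second put of the same slot."""
        if slot not in self._in_use:
            raise DoublePutError(f"slot {slot} is not in use")
        self._in_use.remove(slot)
        self._free.append(slot)


if __name__ == "__main__":
    pool = ToyPool(4)
    slot = pool.get()
    pool.put(slot)
    try:
        pool.put(slot)  # second put of the same slot
    except DoublePutError as exc:
        print("caught:", exc)
```

Without the in-use check, the second put would add the same slot to the free list twice, and two later callers of get() could end up sharing one slot. A guard like this turns silent corruption into an error at the point of misuse, which is easier to trace than a later abort.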