So is the patch the correct one, or is there something more? From what I understand, it's complete.

-----Original Message-----
From: Shehjar Tikoo [mailto:shehjart@xxxxxxxxxxx]
Sent: Thursday, June 18, 2009 2:02 PM
To: Raghavendra G
Cc: Mihai; gluster-devel@xxxxxxxxxx
Subject: Re: crash in afr

Raghavendra G wrote:
> While this fixes the double free, the actual fix has to copy the buffer
> into an ioq_entry, instead of just storing the buffer pointer. If not,
> there can be cases wherein by the time the ioq_entry is written to
> socket, the buffer might've already been freed.

Yup. I hadn't seen your reply to the bug report when I sent this patch.

Thanks
Shehjar

>
> On Thu, Jun 18, 2009 at 2:36 PM, Shehjar Tikoo <shehjart@xxxxxxxxxxx
> <mailto:shehjart@xxxxxxxxxxx>> wrote:
>
> I think I understand why you see the crash.
> Could you please apply the following patch and tell
> us if the crash is still observed?
>
> Thanks
> Shehjar
>
> Mihai wrote:
>
> Hello,
> I'm using server-side replication on 6 servers. Glusterfsd
> crashes every few hours:
> gdb -se /usr/sbin/glusterfsd -c /core.26947
> GNU gdb Fedora (6.8-27.el5)
> Copyright (C) 2008 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type
> "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu"...
> (no debugging symbols found)
>
> warning: .dynamic section for "/usr/lib64/libglusterfs.so.0" is
> not at the expected address
>
> warning: difference appears to be caused by prelink, adjusting
> expectations
> Reading symbols from /usr/lib64/libglusterfs.so.0...done.
> Loaded symbols for /usr/lib64/libglusterfs.so.0
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/libpthread.so.0...done.
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/transport/socket.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/transport/socket.so
> Reading symbols from /usr/lib64/glusterfs/2.0.2/auth/addr.so...done.
> Loaded symbols for /usr/lib64/glusterfs/2.0.2/auth/addr.so
> Reading symbols from /lib64/libnss_files.so.2...done.
> Loaded symbols for /lib64/libnss_files.so.2
> Reading symbols from /lib64/libgcc_s.so.1...done.
> Loaded symbols for /lib64/libgcc_s.so.1
> Core was generated by `/usr/sbin/glusterfsd -f
> /etc/glusterfs/glusterfsd.vol'.
> Program terminated with signal 6, Aborted.
> [New process 26947]
> [New process 26956]
> [New process 26955]
> [New process 26954]
> [New process 26953]
> [New process 26952]
> [New process 26951]
> [New process 26950]
> [New process 26949]
> [New process 26948]
> #0 0x0000003040030215 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0 0x0000003040030215 in raise () from /lib64/libc.so.6
> #1 0x0000003040031cc0 in abort () from /lib64/libc.so.6
> #2 0x000000304006a7fb in __libc_message () from /lib64/libc.so.6
> #3 0x0000003040071ce2 in _int_free () from /lib64/libc.so.6
> #4 0x000000304007590c in free () from /lib64/libc.so.6
> #5 0x00002aaaaaaadcc9 in __socket_ioq_entry_free
> (entry=0x2aaab001da30) at socket.c:331
> #6 0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
> optimized out>, entry=0x2aaab001da30) at socket.c:368
> #7 0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
> buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
> optimized out>,
> iobref=<value optimized out>) at socket.c:1281
> #8 0x00002b2e7c775bd3 in protocol_client_xfer
> (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
> op=40, hdr=0x2aaab00378c0, hdrlen=340,
> vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
> #9 0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
> this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
> dict=0x2aaab4031fc0)
> at client-protocol.c:1922
> #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
> (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
> #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
> this=0xae0b280, child_index=1) at afr-transaction.c:979
> #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
> cookie=<value optimized out>, this=0xae0b280, op_ret=0,
> op_errno=0) at afr-transaction.c:906
> #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
> optimized out>, cookie=<value optimized out>, this=<value
> optimized out>, op_ret=-1,
> op_errno=128) at defaults.c:1199
> #14 0x00002b2e7c358182 in
pl_inodelk (frame=0x2aaab4034c10,
> this=0xae08870, volume=<value optimized out>,
> loc=0x2aaab4032170, cmd=7, flock=0x0)
> at internal.c:194
> #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
> this=0xae09080, volume=0xae0b260 "replicate",
> loc=0x2aaab4004238, cmd=7,
> lock=0x7fff2f422f80) at defaults.c:1209
> #16 0x00002b2e7c9a33ba in afr_lock_rec (frame=0x2aaab401ea70,
> this=0xae0b280, child_index=0) at afr-transaction.c:1006
> #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
> this=0xae0b280, type=AFR_DATA_TRANSACTION) at afr-transaction.c:1170
> #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
> this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
> afr-inode-write.c:1224
> #19 0x00002b2e7cbc0969 in server_truncate_resume
> (frame=0x2aaab403acc0, this=<value optimized out>,
> loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
> #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
> call-stub.c:2384
> #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
> bound_xl=<value optimized out>, hdr=<value optimized out>,
> hdrlen=<value optimized out>,
> iobuf=<value optimized out>) at server-protocol.c:4291
> #22 0x00002b2e7cbbfb20 in protocol_server_pollin
> (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
> #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
> optimized out>, data=0x6) at server-protocol.c:7791
> #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
> optimized out>, idx=11, data=0xae17960, poll_in=1, poll_out=0,
> poll_err=0) at socket.c:813
> #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
> (event_pool=0xae02300) at event.c:804
> #26 0x0000000000403899 in main ()
> (gdb) bt full
> #0 0x0000003040030215 in raise () from /lib64/libc.so.6
> No symbol table info available.
> #1 0x0000003040031cc0 in abort () from /lib64/libc.so.6
> No symbol table info available.
> #2 0x000000304006a7fb in __libc_message () from /lib64/libc.so.6
> No symbol table info available.
> #3 0x0000003040071ce2 in _int_free () from /lib64/libc.so.6
> No symbol table info available.
> #4 0x000000304007590c in free () from /lib64/libc.so.6
> No symbol table info available.
> #5 0x00002aaaaaaadcc9 in __socket_ioq_entry_free
> (entry=0x2aaab001da30) at socket.c:331
> No locals.
> #6 0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
> optimized out>, entry=0x2aaab001da30) at socket.c:368
> ret = 0
> __PRETTY_FUNCTION__ = "__socket_ioq_churn_entry"
> #7 0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
> buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
> optimized out>,
> iobref=<value optimized out>) at socket.c:1281
> priv = (socket_private_t *) 0xae11ec0
> ret = <value optimized out>
> need_poll_out = <value optimized out>
> entry = (struct ioq *) 0x2aaab001da30
> ctx = (glusterfs_ctx_t *) 0xae02010
> __FUNCTION__ = "socket_submit"
> #8 0x00002b2e7c775bd3 in protocol_client_xfer
> (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
> op=40, hdr=0x2aaab00378c0, hdrlen=340,
> vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
> conf = (client_conf_t *) 0xae113c0
> conn = (client_connection_t *) 0xae11f90
> callid = 309893
> ret = <value optimized out>
> rsphdr = {callid = 0, type = 0, op = 0, size = 0, {req =
> {pid = 0, uid = 0, gid = 0}, rsp = {op_ret = 0, op_errno = 0}}}
> forget = {hdr = 0x0, hdrlen = 0, frame = 0x0}
> #9 0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
> this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
> dict=0x2aaab4031fc0)
> at client-protocol.c:1922
> hdr = (gf_hdr_common_t *) 0x101010101010101
> req = <value optimized out>
> dict_len = 242
> ret = <value optimized out>
> pathlen = <value optimized out>
> ino = 13893685
> __FUNCTION__ = "client_xattrop"
> #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
> (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
> _new = (call_frame_t *) 0x6943
> priv = (afr_private_t *) 0xae13740
> ret = <value optimized out>
> call_count = 1
> xattr = (dict_t *) 0x2aaab4031fc0
> local = (afr_local_t *) 0x2aaab4004200
> __FUNCTION__ = "afr_changelog_pre_op"
> #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
> this=0xae0b280, child_index=1) at afr-transaction.c:979
> local = (afr_local_t *) 0x2aaab4004200
> priv = (afr_private_t *) 0xae13740
> flock = {l_type = 1, l_whence = 12098, l_start = 0, l_len
> = 0, l_pid = 792866320}
> lower = <value optimized out>
> higher = <value optimized out>
> lower_name = <value optimized out>
> higher_name = <value optimized out>
> __FUNCTION__ = "afr_lock_rec"
> #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
> cookie=<value optimized out>, this=0xae0b280, op_ret=0,
> op_errno=0) at afr-transaction.c:906
> ---Type <return> to continue, or q <return> to quit---
> local = (afr_local_t *) 0x2aaab4004200
> child_index = 0
> call_count = 0
> __FUNCTION__ = "afr_lock_cbk"
> #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
> optimized out>, cookie=<value optimized out>, this=<value
> optimized out>, op_ret=-1,
> op_errno=128) at defaults.c:1199
> fn = (ret_fn_t) 0x101010101010101
> _parent = (call_frame_t *) 0x6943
> #14 0x00002b2e7c358182 in pl_inodelk (frame=0x2aaab4034c10,
> this=0xae08870, volume=<value optimized out>,
> loc=0x2aaab4032170, cmd=7, flock=0x0)
> at internal.c:194
> fn = (ret_fn_t) 0x101010101010101
> _parent = (call_frame_t *) 0x6943
> op_ret = -1
> op_errno = 128
> ret = 0
> can_block = 1
> transport = <value optimized out>
> client_pid = 1
> pinode = (pl_inode_t *) 0x2aaab0032810
> reqlock = (posix_lock_t *) 0x2aaab4032170
> __FUNCTION__ = "pl_inodelk"
> #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
> this=0xae09080, volume=0xae0b260 "replicate",
> loc=0x2aaab4004238, cmd=7,
> lock=0x7fff2f422f80) at defaults.c:1209
> _new = (call_frame_t *) 0x6943
> #16 0x00002b2e7c9a33ba in
afr_lock_rec (frame=0x2aaab401ea70,
> this=0xae0b280, child_index=0) at afr-transaction.c:1006
> _new = (call_frame_t *) 0x6943
> local = (afr_local_t *) 0x2aaab4004200
> priv = (afr_private_t *) 0xae13740
> flock = {l_type = 1, l_whence = 1, l_start = 0, l_len =
> 0, l_pid = 1074216160}
> lower = <value optimized out>
> higher = <value optimized out>
> lower_name = <value optimized out>
> higher_name = <value optimized out>
> __FUNCTION__ = "afr_lock_rec"
> #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
> this=0xae0b280, type=AFR_DATA_TRANSACTION) at afr-transaction.c:1170
> local = (afr_local_t *) 0x2aaab4004200
> priv = (afr_private_t *) 0xae13740
> #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
> this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
> afr-inode-write.c:1224
> transaction_frame = (call_frame_t *) 0x2aaab401ea70
> op_errno = 107
> __FUNCTION__ = "afr_truncate"
> #19 0x00002b2e7cbc0969 in server_truncate_resume
> (frame=0x2aaab403acc0, this=<value optimized out>,
> loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
> _new = (call_frame_t *) 0x6943
> __FUNCTION__ = "server_truncate_resume"
> #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
> call-stub.c:2384
> __FUNCTION__ = "call_resume"
> #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
> bound_xl=<value optimized out>, hdr=<value optimized out>,
> hdrlen=<value optimized out>,
> iobuf=<value optimized out>) at server-protocol.c:4291
> truncate_stub = (call_stub_t *) 0x0
> state = (server_state_t *) 0x2aaab4032000
> #22 0x00002b2e7cbbfb20 in protocol_server_pollin
> (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
> hdr = 0x2aaab4017330 ""
> hdrlen = 98
> ret = 0
> iobuf = (struct iobuf *) 0x0
> #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
> optimized out>, data=0x6) at server-protocol.c:7791
> ret = <value optimized out>
> trans = (transport_t *) 0x6943
> ---Type <return> to continue, or q <return> to
quit---
> peerinfo = (peer_info_t *) 0xae179d0
> myinfo = (peer_info_t *) 0xae17ac0
> __FUNCTION__ = "notify"
> #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
> optimized out>, idx=11, data=0xae17960, poll_in=1, poll_out=0,
> poll_err=0) at socket.c:813
> this = (transport_t *) 0x6943
> priv = (socket_private_t *) 0xae16ce0
> ret = 0
> #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
> (event_pool=0xae02300) at event.c:804
> events = (struct epoll_event *) 0xae15990
> i = 0
> ret = 1
> __FUNCTION__ = "event_dispatch_epoll"
> #26 0x0000000000403899 in main ()
> No symbol table info available.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx <mailto:Gluster-devel@xxxxxxxxxx>
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx <mailto:Gluster-devel@xxxxxxxxxx>
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
> --
> Raghavendra G
>
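
To illustrate the point raised earlier in the thread about copying the buffer into the ioq entry rather than storing the caller's pointer, here is a minimal C sketch. It is not the GlusterFS code and not the patch discussed above: struct demo_ioq_entry, demo_entry_new() and demo_entry_free() are hypothetical names made up for this example. The idea is only that the queued entry owns a private copy of the data, so the caller may free its own buffer at any time and each allocation is freed exactly once.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue entry for illustration; NOT the real struct ioq. */
struct demo_ioq_entry {
    char  *buf;   /* private copy, owned by the entry */
    size_t len;   /* number of valid bytes in buf */
};

/*
 * Create an entry holding a private copy of src[0..len).
 * The caller keeps ownership of src and may free it right after this
 * returns. Returns NULL if memory cannot be allocated.
 */
static struct demo_ioq_entry *demo_entry_new(const char *src, size_t len)
{
    struct demo_ioq_entry *entry;

    assert(src != NULL || len == 0);

    entry = calloc(1, sizeof(*entry));
    if (entry == NULL)
        return NULL;

    /* Allocate at least one byte so buf is never NULL on success. */
    entry->buf = malloc(len > 0 ? len : 1);
    if (entry->buf == NULL) {
        free(entry);
        return NULL;
    }

    if (len > 0)
        memcpy(entry->buf, src, len);
    entry->len = len;
    return entry;
}

/*
 * Release an entry. Only the entry's own copy is freed, so the caller's
 * original buffer is never freed here (no double free).
 */
static void demo_entry_free(struct demo_ioq_entry *entry)
{
    if (entry == NULL)
        return;
    free(entry->buf);
    free(entry);
}
```

With this ownership rule a caller can do demo_entry_new(hdr, hdrlen), free hdr immediately, and the queued entry stays valid until demo_entry_free() is called after the write to the socket. The cost is one extra allocation and copy per queued message.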