Re: crash in afr


 



Shehjar,
Sorry, it's not a double free either. I was wrong.

Mihai,

We are still looking into this bug. We'll get back to you once we fix this.

regards,

On Thu, Jun 18, 2009 at 3:01 PM, Shehjar Tikoo <shehjart@xxxxxxxxxxx> wrote:
Raghavendra G wrote:
While this fixes the double free, the actual fix has to copy the buffer into an ioq_entry instead of just storing the buffer pointer. Otherwise, there can be cases where the buffer has already been freed by the time the ioq_entry is written to the socket.
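
To illustrate the copy-instead-of-store idea, here is a minimal C sketch. The type and function names (ioq_entry_sketch, ioq_entry_sketch_new, ioq_entry_sketch_free) are hypothetical and simplified; they are not the real struct ioq or helpers from socket.c, and this is not the actual patch. It only shows an entry that owns a private copy of the data, so the caller can free its own buffer at any time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified queue entry; NOT the real struct ioq in socket.c. */
struct ioq_entry_sketch {
    char   *buf;  /* private copy owned by the entry */
    size_t  len;  /* number of valid bytes in buf */
};

/* Create an entry that owns a copy of the caller's buffer.
 * Returns NULL on allocation failure. */
static struct ioq_entry_sketch *
ioq_entry_sketch_new(const char *buf, size_t len)
{
    struct ioq_entry_sketch *entry;

    assert(buf != NULL);

    entry = calloc(1, sizeof(*entry));
    if (entry == NULL)
        return NULL;

    /* Allocate at least one byte so a zero-length buffer still gets a valid pointer. */
    entry->buf = malloc(len ? len : 1);
    if (entry->buf == NULL) {
        free(entry);
        return NULL;
    }

    /* Copy the data instead of keeping the caller's pointer. */
    memcpy(entry->buf, buf, len);
    entry->len = len;
    return entry;
}

/* Free the entry and the copy it owns; the caller's buffer is never touched. */
static void
ioq_entry_sketch_free(struct ioq_entry_sketch *entry)
{
    if (entry == NULL)
        return;
    free(entry->buf);
    free(entry);
}
```

With this ownership rule, the entry's free path releases only memory the entry allocated itself, so it cannot collide with the caller freeing the original header buffer. The cost is one extra allocation and copy per queued message.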

Yup. I hadn't seen your reply to the bug report when I sent this patch.

Thanks
Shehjar


On Thu, Jun 18, 2009 at 2:36 PM, Shehjar Tikoo <shehjart@xxxxxxxxxxx> wrote:

   I think I understand why you see the crash.
   Could you please apply the following patch and let
   us know whether the crash still occurs?

   Thanks
   Shehjar




   Mihai wrote:

       Hello,
       I'm using server-side replication on 6 servers. Glusterfsd
       crashes every few hours:
       gdb -se /usr/sbin/glusterfsd -c /core.26947
       GNU gdb Fedora (6.8-27.el5)
       Copyright (C) 2008 Free Software Foundation, Inc.
       License GPLv3+: GNU GPL version 3 or later
       <http://gnu.org/licenses/gpl.html>
       This is free software: you are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.  Type
       "show copying"
       and "show warranty" for details.
       This GDB was configured as "x86_64-redhat-linux-gnu"...
       (no debugging symbols found)

       warning: .dynamic section for "/usr/lib64/libglusterfs.so.0" is
       not at the expected address

       warning: difference appears to be caused by prelink, adjusting
       expectations Reading symbols from
       /usr/lib64/libglusterfs.so.0...done.
       Loaded symbols for /usr/lib64/libglusterfs.so.0 Reading symbols
       from /lib64/libdl.so.2...done.
       Loaded symbols for /lib64/libdl.so.2
       Reading symbols from /lib64/libpthread.so.0...done.
       Loaded symbols for /lib64/libpthread.so.0 Reading symbols from
       /lib64/libc.so.6...done.
       Loaded symbols for /lib64/libc.so.6
       Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
       Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols
       from /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/storage/posix.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/features/locks.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/performance/io-threads.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/protocol/client.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/cluster/replicate.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so...done.
       Loaded symbols for
       /usr/lib64/glusterfs/2.0.2/xlator/protocol/server.so
       Reading symbols from
       /usr/lib64/glusterfs/2.0.2/transport/socket.so...done.
       Loaded symbols for /usr/lib64/glusterfs/2.0.2/transport/socket.so
       Reading symbols from /usr/lib64/glusterfs/2.0.2/auth/addr.so...done.
       Loaded symbols for /usr/lib64/glusterfs/2.0.2/auth/addr.so
       Reading symbols from /lib64/libnss_files.so.2...done.
       Loaded symbols for /lib64/libnss_files.so.2 Reading symbols from
       /lib64/libgcc_s.so.1...done.
       Loaded symbols for /lib64/libgcc_s.so.1
       Core was generated by `/usr/sbin/glusterfsd -f
       /etc/glusterfs/glusterfsd.vol'.
       Program terminated with signal 6, Aborted.
       [New process 26947]
       [New process 26956]
       [New process 26955]
       [New process 26954]
       [New process 26953]
       [New process 26952]
       [New process 26951]
       [New process 26950]
       [New process 26949]
       [New process 26948]
       #0  0x0000003040030215 in raise () from /lib64/libc.so.6
       (gdb) bt
       #0  0x0000003040030215 in raise () from /lib64/libc.so.6
       #1  0x0000003040031cc0 in abort () from /lib64/libc.so.6
       #2  0x000000304006a7fb in __libc_message () from /lib64/libc.so.6
       #3  0x0000003040071ce2 in _int_free () from /lib64/libc.so.6
       #4  0x000000304007590c in free () from /lib64/libc.so.6
       #5  0x00002aaaaaaadcc9 in __socket_ioq_entry_free
       (entry=0x2aaab001da30) at socket.c:331
       #6  0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
       optimized out>, entry=0x2aaab001da30) at socket.c:368
       #7  0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
       buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
       optimized out>,
          iobref=<value optimized out>) at socket.c:1281
       #8  0x00002b2e7c775bd3 in protocol_client_xfer
       (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
       op=40, hdr=0x2aaab00378c0, hdrlen=340,
          vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
       #9  0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
       this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
       dict=0x2aaab4031fc0)
          at client-protocol.c:1922
       #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
       (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
       #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
       this=0xae0b280, child_index=1) at afr-transaction.c:979
       #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
       cookie=<value optimized out>, this=0xae0b280, op_ret=0,
       op_errno=0) at afr-transaction.c:906
       #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
       optimized out>, cookie=<value optimized out>, this=<value
       optimized out>, op_ret=-1,
          op_errno=128) at defaults.c:1199
       #14 0x00002b2e7c358182 in pl_inodelk (frame=0x2aaab4034c10,
       this=0xae08870, volume=<value optimized out>,
       loc=0x2aaab4032170, cmd=7, flock=0x0)
          at internal.c:194
       #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
       this=0xae09080, volume=0xae0b260 "replicate",
       loc=0x2aaab4004238, cmd=7,
          lock=0x7fff2f422f80) at defaults.c:1209
       #16 0x00002b2e7c9a33ba in afr_lock_rec (frame=0x2aaab401ea70,
       this=0xae0b280, child_index=0) at afr-transaction.c:1006
       #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
       this=0xae0b280, type=AFR_DATA_TRANSACTION) at afr-transaction.c:1170
       #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
       this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
       afr-inode-write.c:1224
       #19 0x00002b2e7cbc0969 in server_truncate_resume
       (frame=0x2aaab403acc0, this=<value optimized out>,
       loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
       #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
       call-stub.c:2384
       #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
       bound_xl=<value optimized out>, hdr=<value optimized out>,
       hdrlen=<value optimized out>,
          iobuf=<value optimized out>) at server-protocol.c:4291
       #22 0x00002b2e7cbbfb20 in protocol_server_pollin
       (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
       #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
       optimized out>, data="" at server-protocol.c:7791
       #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
       optimized out>, idx=11, data="" poll_in=1, poll_out=0,
       poll_err=0) at socket.c:813
       #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
       (event_pool=0xae02300) at event.c:804
       #26 0x0000000000403899 in main ()
       (gdb) bt full
       #0  0x0000003040030215 in raise () from /lib64/libc.so.6 No
       symbol table info available.
       #1  0x0000003040031cc0 in abort () from /lib64/libc.so.6 No
       symbol table info available.
       #2  0x000000304006a7fb in __libc_message () from
       /lib64/libc.so.6 No symbol table info available.
       #3  0x0000003040071ce2 in _int_free () from /lib64/libc.so.6 No
       symbol table info available.
       #4  0x000000304007590c in free () from /lib64/libc.so.6 No
       symbol table info available.
       #5  0x00002aaaaaaadcc9 in __socket_ioq_entry_free
       (entry=0x2aaab001da30) at socket.c:331 No locals.
       #6  0x00002aaaaaaaf1c9 in __socket_ioq_churn_entry (this=<value
       optimized out>, entry=0x2aaab001da30) at socket.c:368
              ret = 0
              __PRETTY_FUNCTION__ = "__socket_ioq_churn_entry"
       #7  0x00002aaaaaaaf8ec in socket_submit (this=0xae11a70,
       buf=0x2aaab00378c0 "", len=340, vector=0x0, count=<value
       optimized out>,
          iobref=<value optimized out>) at socket.c:1281
              priv = (socket_private_t *) 0xae11ec0
              ret = <value optimized out>
              need_poll_out = <value optimized out>
              entry = (struct ioq *) 0x2aaab001da30
              ctx = (glusterfs_ctx_t *) 0xae02010
              __FUNCTION__ = "socket_submit"
       #8  0x00002b2e7c775bd3 in protocol_client_xfer
       (frame=0x2aaab0030ab0, this=0xae0ab00, trans=0xae11a70, type=1,
       op=40, hdr=0x2aaab00378c0, hdrlen=340,
          vector=0x0, count=0, iobref=0x0) at client-protocol.c:636
              conf = (client_conf_t *) 0xae113c0
              conn = (client_connection_t *) 0xae11f90
              callid = 309893
              ret = <value optimized out>
              rsphdr = {callid = 0, type = 0, op = 0, size = 0, {req =
       {pid = 0, uid = 0, gid = 0}, rsp = {op_ret = 0, op_errno = 0}}}
              forget = {hdr = 0x0, hdrlen = 0, frame = 0x0}
       #9  0x00002b2e7c77bc1a in client_xattrop (frame=0x2aaab0030ab0,
       this=0xae0ab00, loc=0x2aaab4004238, flags=GF_XATTROP_ADD_ARRAY,
       dict=0x2aaab4031fc0)
          at client-protocol.c:1922
              hdr = (gf_hdr_common_t *) 0x101010101010101
              req = <value optimized out>
              dict_len = 242
              ret = <value optimized out>
              pathlen = <value optimized out>
              ino = 13893685
              __FUNCTION__ = "client_xattrop"
       #10 0x00002b2e7c9a2cda in afr_changelog_pre_op
       (frame=0x2aaab401ea70, this=0xae0b280) at afr-transaction.c:782
              _new = (call_frame_t *) 0x6943
              priv = (afr_private_t *) 0xae13740
              ret = <value optimized out>
              call_count = 1
              xattr = (dict_t *) 0x2aaab4031fc0
              local = (afr_local_t *) 0x2aaab4004200
              __FUNCTION__ = "afr_changelog_pre_op"
       #11 0x00002b2e7c9a2f31 in afr_lock_rec (frame=0x2aaab401ea70,
       this=0xae0b280, child_index=1) at afr-transaction.c:979
              local = (afr_local_t *) 0x2aaab4004200
              priv = (afr_private_t *) 0xae13740
              flock = {l_type = 1, l_whence = 12098, l_start = 0, l_len
       = 0, l_pid = 792866320}
              lower = <value optimized out>
              higher = <value optimized out>
              lower_name = <value optimized out>
              higher_name = <value optimized out>
              __FUNCTION__ = "afr_lock_rec"
       #12 0x00002b2e7c9a36a8 in afr_lock_cbk (frame=0x2aaab401ea70,
       cookie=<value optimized out>, this=0xae0b280, op_ret=0,
       op_errno=0) at afr-transaction.c:906
              local = (afr_local_t *) 0x2aaab4004200
              child_index = 0
              call_count = 0
              __FUNCTION__ = "afr_lock_cbk"
       #13 0x00002b2e7b6a75f0 in default_inodelk_cbk (frame=<value
       optimized out>, cookie=<value optimized out>, this=<value
       optimized out>, op_ret=-1,
          op_errno=128) at defaults.c:1199
              fn = (ret_fn_t) 0x101010101010101
              _parent = (call_frame_t *) 0x6943
       #14 0x00002b2e7c358182 in pl_inodelk (frame=0x2aaab4034c10,
       this=0xae08870, volume=<value optimized out>,
       loc=0x2aaab4032170, cmd=7, flock=0x0)
          at internal.c:194
              fn = (ret_fn_t) 0x101010101010101
              _parent = (call_frame_t *) 0x6943
              op_ret = -1
              op_errno = 128
              ret = 0
              can_block = 1
              transport = <value optimized out>
              client_pid = 1
              pinode = (pl_inode_t *) 0x2aaab0032810
              reqlock = (posix_lock_t *) 0x2aaab4032170
              __FUNCTION__ = "pl_inodelk"
       #15 0x00002b2e7b6a815c in default_inodelk (frame=0x2aaab4017660,
       this=0xae09080, volume=0xae0b260 "replicate",
       loc=0x2aaab4004238, cmd=7,
          lock=0x7fff2f422f80) at defaults.c:1209
              _new = (call_frame_t *) 0x6943
       #16 0x00002b2e7c9a33ba in afr_lock_rec (frame=0x2aaab401ea70,
       this=0xae0b280, child_index=0) at afr-transaction.c:1006
              _new = (call_frame_t *) 0x6943
              local = (afr_local_t *) 0x2aaab4004200
              priv = (afr_private_t *) 0xae13740
              flock = {l_type = 1, l_whence = 1, l_start = 0, l_len =
       0, l_pid = 1074216160}
              lower = <value optimized out>
              higher = <value optimized out>
              lower_name = <value optimized out>
              higher_name = <value optimized out>
              __FUNCTION__ = "afr_lock_rec"
       #17 0x00002b2e7c9a35c2 in afr_transaction (frame=0x2aaab401ea70,
       this=0xae0b280, type=AFR_DATA_TRANSACTION) at afr-transaction.c:1170
              local = (afr_local_t *) 0x2aaab4004200
              priv = (afr_private_t *) 0xae13740
       #18 0x00002b2e7c9a07cd in afr_truncate (frame=0x2aaab403ac30,
       this=0xae0b280, loc=0x2aaab40174a0, offset=0) at
       afr-inode-write.c:1224
              transaction_frame = (call_frame_t *) 0x2aaab401ea70
              op_errno = 107
              __FUNCTION__ = "afr_truncate"
       #19 0x00002b2e7cbc0969 in server_truncate_resume
       (frame=0x2aaab403acc0, this=<value optimized out>,
       loc=0x2aaab40174a0, offset=0) at server-protocol.c:4243
              _new = (call_frame_t *) 0x6943
              __FUNCTION__ = "server_truncate_resume"
       #20 0x00002b2e7b6b06f7 in call_resume (stub=0x2aaab4017470) at
       call-stub.c:2384
              __FUNCTION__ = "call_resume"
       #21 0x00002b2e7cbc4125 in server_truncate (frame=0x2aaab403acc0,
       bound_xl=<value optimized out>, hdr=<value optimized out>,
       hdrlen=<value optimized out>,
          iobuf=<value optimized out>) at server-protocol.c:4291
              truncate_stub = (call_stub_t *) 0x0
              state = (server_state_t *) 0x2aaab4032000
       #22 0x00002b2e7cbbfb20 in protocol_server_pollin
       (this=0xae0bdf0, trans=0xae17960) at server-protocol.c:7735
              hdr = 0x2aaab4017330 ""
              hdrlen = 98
              ret = 0
              iobuf = (struct iobuf *) 0x0
       #23 0x00002b2e7cbbfbfb in notify (this=0xae0bdf0, event=<value
       optimized out>, data="" at server-protocol.c:7791
              ret = <value optimized out>
              trans = (transport_t *) 0x6943
              peerinfo = (peer_info_t *) 0xae179d0
              myinfo = (peer_info_t *) 0xae17ac0
              __FUNCTION__ = "notify"
       #24 0x00002aaaaaaafb43 in socket_event_handler (fd=<value
       optimized out>, idx=11, data="" poll_in=1, poll_out=0,
       poll_err=0) at socket.c:813
              this = (transport_t *) 0x6943
              priv = (socket_private_t *) 0xae16ce0
              ret = 0
       #25 0x00002b2e7b6ba2a5 in event_dispatch_epoll
       (event_pool=0xae02300) at event.c:804
              events = (struct epoll_event *) 0xae15990
              i = 0
              ret = 1
              __FUNCTION__ = "event_dispatch_epoll"
       #26 0x0000000000403899 in main ()
       No symbol table info available.





       _______________________________________________
       Gluster-devel mailing list
       Gluster-devel@xxxxxxxxxx

       http://lists.nongnu.org/mailman/listinfo/gluster-devel







--
Raghavendra G

