Hi glusterfs experts,

I am hitting a glusterfs process coredump again in my environment, shortly after the glusterfs process starts up. frame->local has become NULL, but it seems the frame itself has not been destroyed yet, since the magic number (GF_MEM_HEADER_MAGIC) in the pool header is still untouched.

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs --acl --volfile-server=mn-0.local --volfile-server=mn-1.loc'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f867fcd2971 in client3_3_inodelk_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>,
    myframe=0x7f8654008830) at client-rpc-fops.c:1510
1510            CLIENT_STACK_UNWIND (inodelk, frame, rsp.op_ret,
[Current thread is 1 (Thread 0x7f867d6d4700 (LWP 3046))]
Missing separate debuginfos, use: dnf debuginfo-install glusterfs-fuse-3.12.15-1.wos2.wf29.x86_64
(gdb) bt
#0  0x00007f867fcd2971 in client3_3_inodelk_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>,
    myframe=0x7f8654008830) at client-rpc-fops.c:1510
#1  0x00007f8685ea5584 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f8678070030, pollin=pollin@entry=0x7f86702833e0)
    at rpc-clnt.c:782
#2  0x00007f8685ea587b in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f8678070060, event=<optimized out>, data="")
    at rpc-clnt.c:975
#3  0x00007f8685ea1b83 in rpc_transport_notify (this=this@entry=0x7f8678070270, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
    data="") at rpc-transport.c:538
#4  0x00007f8680b99867 in socket_event_poll_in (notify_handled=_gf_true, this=0x7f8678070270) at socket.c:2260
#5  socket_event_handler (fd=<optimized out>, idx=3, gen=1, data="", poll_in=<optimized out>, poll_out=<optimized out>,
    poll_err=<optimized out>) at socket.c:2645
#6  0x00007f8686132911 in event_dispatch_epoll_handler (event=0x7f867d6d3e6c, event_pool=0x55e1b2792b00) at event-epoll.c:583
#7  event_dispatch_epoll_worker (data="") at event-epoll.c:659
#8  0x00007f8684ea65da in start_thread () from /lib64/libpthread.so.0
#9  0x00007f868474eeaf in clone () from /lib64/libc.so.6
(gdb) print *(call_frame_t*)myframe
$3 = {root = 0x7f86540271a0, parent = 0x0, frames = {next = 0x7f8654027898, prev = 0x7f8654027898}, local = 0x0,
  this = 0x7f8678013080, ret = 0x0, ref_count = 0, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0,
        __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 39 times>, __align = 0}}, cookie = 0x0, complete = _gf_false, xid = 0,
  op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x0, wind_to = 0x0,
  unwind_from = 0x0, unwind_to = 0x0}
(gdb) x/4xw 0x7f8654008810
0x7f8654008810: 0xcafebabe      0x00000000      0x00000000      0x00000000
(gdb) p *(pooled_obj_hdr_t *)0x7f8654008810
$2 = {magic = 3405691582, next = 0x0, pool_list = 0x7f8654000b80, power_of_two = 8}

To debug this, I added a "uint32_t xid" field to the _call_frame structure and set it from rpcreq->xid in the __save_frame function. Normally this xid should only be 0 immediately after the frame comes out of the memory pool via create_frame. In this core the xid is 0, so it looks like the frame was destroyed and handed back to the pool for reuse before this callback ran. Do you have any idea how this could happen?

cynthia
_______________________________________________
Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel