+Krutika

----- Original Message -----
> From: "Anoop C S" <anoopcs@xxxxxxxxxx>
> To: "Atin Mukherjee" <amukherj@xxxxxxxxxx>
> Cc: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>, "Ravishankar N" <ravishankar@xxxxxxxxxx>, "Anuradha Talur" <atalur@xxxxxxxxxx>, gluster-devel@xxxxxxxxxxx
> Sent: Friday, April 22, 2016 2:14:28 PM
> Subject: Re: Core generated by trash.t
>
> On Wed, 2016-04-20 at 16:24 +0530, Atin Mukherjee wrote:
> > I should have said the regression link is irrelevant here. Try running
> > this test on your local setup multiple times on mainline. I do believe
> > you should see the crash.
> >
> I could see a coredump on running trash.t multiple times in a while loop.
> Info from the coredump:
>
> Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x000000000040bd31 in glusterfs_handle_translator_op (req=0x7feab8001dec) at glusterfsd-mgmt.c:590
> 590             any = active->first;
> [Current thread is 1 (Thread 0x7feac1657700 (LWP 12050))]
> (gdb) l
> 585                     goto out;
> 586             }
> 587
> 588             ctx = glusterfsd_ctx;
> 589             active = ctx->active;
> 590             any = active->first;
> 591             input = dict_new ();
> 592             ret = dict_unserialize (xlator_req.input.input_val,
> 593                                     xlator_req.input.input_len,
> 594                                     &input);
> (gdb) p ctx
> $1 = (glusterfs_ctx_t *) 0x7fa010
> (gdb) p ctx->active
> $2 = (glusterfs_graph_t *) 0x0

I think this is because the request came to shd even before the graph is
initialized? Thanks for the test case. I will take a look at this. (A minimal
sketch of a possible guard for this is at the end of this mail, after the
quoted backtrace.)

Pranith

> (gdb) p *req
> $1 = {trans = 0x7feab8000e20, svc = 0x83ca50, prog = 0x874810, xid = 1,
>   prognum = 4867634, progver = 2, procnum = 3, type = 0, uid = 0, gid = 0,
>   pid = 0, lk_owner = {len = 4, data = '\000' <repeats 1023 times>},
>   gfs_id = 0, auxgids = 0x7feab800223c, auxgidsmall = {0 <repeats 128 times>},
>   auxgidlarge = 0x0, auxgidcount = 0, msg = {{iov_base = 0x7feacc253840,
>       iov_len = 488}, {iov_base = 0x0, iov_len = 0} <repeats 15 times>},
>   count = 1, iobref = 0x7feab8000c40, rpc_status = 0, rpc_err = 0,
>   auth_err = 0, txlist = {next = 0x7feab800256c, prev = 0x7feab800256c},
>   payloadsize = 0, cred = {flavour = 390039, datalen = 24,
>     authdata = '\000' <repeats 19 times>, "\004", '\000' <repeats 379 times>},
>   verf = {flavour = 0, datalen = 0, authdata = '\000' <repeats 399 times>},
>   synctask = _gf_true, private = 0x0, trans_private = 0x0,
>   hdr_iobuf = 0x82b038, reply = 0x0}
> (gdb) p req->procnum
> $3 = 3  <== GLUSTERD_BRICK_XLATOR_OP
> (gdb) t a a bt
>
> Thread 6 (Thread 0x7feabf178700 (LWP 12055)):
> #0  0x00007feaca522043 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
> #1  0x00007feacbe5076f in event_dispatch_epoll_worker (data=0x878130) at event-epoll.c:664
> #2  0x00007feacac4560a in start_thread (arg=0x7feabf178700) at pthread_create.c:334
> #3  0x00007feaca521a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> Thread 5 (Thread 0x7feac2659700 (LWP 12048)):
> #0  do_sigwait (sig=0x7feac2658e3c, set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:64
> #1  __sigwait (set=<optimized out>, sig=0x7feac2658e3c) at ../sysdeps/unix/sysv/linux/sigwait.c:96
> #2  0x0000000000409895 in glusterfs_sigwaiter (arg=0x7ffe3debbf00) at glusterfsd.c:2032
> #3  0x00007feacac4560a in start_thread (arg=0x7feac2659700) at pthread_create.c:334
> #4  0x00007feaca521a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> Thread 4 (Thread 0x7feacc2b4780 (LWP 12046)):
> #0  0x00007feacac466ad in pthread_join (threadid=140646205064960, thread_return=0x0) at pthread_join.c:90
> #1  0x00007feacbe509bb in event_dispatch_epoll (event_pool=0x830b80) at event-epoll.c:758
> #2  0x00007feacbe17a91 in event_dispatch (event_pool=0x830b80) at event.c:124
> #3  0x000000000040a3c8 in main (argc=13, argv=0x7ffe3debd0f8) at glusterfsd.c:2376
>
> Thread 3 (Thread 0x7feac2e5a700 (LWP 12047)):
> #0  0x00007feacac4e27d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
> #1  0x00007feacbdfc152 in gf_timer_proc (ctx=0x7fa010) at timer.c:188
> #2  0x00007feacac4560a in start_thread (arg=0x7feac2e5a700) at pthread_create.c:334
> #3  0x00007feaca521a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> Thread 2 (Thread 0x7feac1e58700 (LWP 12049)):
> #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
> #1  0x00007feacbe2d73d in syncenv_task (proc=0x838310) at syncop.c:603
> #2  0x00007feacbe2d9dd in syncenv_processor (thdata=0x838310) at syncop.c:695
> #3  0x00007feacac4560a in start_thread (arg=0x7feac1e58700) at pthread_create.c:334
> #4  0x00007feaca521a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> Thread 1 (Thread 0x7feac1657700 (LWP 12050)):
> #0  0x000000000040bd31 in glusterfs_handle_translator_op (req=0x7feab8001dec) at glusterfsd-mgmt.c:590
> #1  0x00007feacbe2cf04 in synctask_wrap (old_task=0x7feab80031c0) at syncop.c:375
> #2  0x00007feaca467f30 in ?? () from /lib64/libc.so.6
> #3  0x0000000000000000 in ?? ()
>
> Looking at the core, the crash comes from the glusterfs_handle_translator_op()
> routine while a 'volume heal' command is being handled. I could then easily
> create a small test case to reproduce the issue. Please find it attached.
>
> --Anoop C S.
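
For reference, below is a minimal standalone sketch of the kind of guard
mentioned above: refuse the brick op while ctx->active is still NULL instead of
dereferencing it. The types and names here (graph_t, ctx_t,
handle_translator_op) are simplified stand-ins for illustration only, not the
real glusterfs_ctx_t/glusterfs_graph_t definitions and not an actual patch.

#include <stdio.h>

/* Simplified stand-in for glusterfs_graph_t */
typedef struct graph {
        void *first;            /* first translator in the graph */
} graph_t;

/* Simplified stand-in for glusterfs_ctx_t */
typedef struct ctx {
        graph_t *active;        /* stays NULL until the volfile graph is set up */
} ctx_t;

/* Rough analogue of glusterfs_handle_translator_op(): fail the request
 * cleanly when the graph is not initialized yet, instead of crashing on
 * active->first. */
static int
handle_translator_op (ctx_t *ctx)
{
        graph_t *active = NULL;
        void    *any    = NULL;
        int      ret    = -1;

        if (!ctx)
                goto out;

        active = ctx->active;
        if (!active) {
                /* graph not initialized yet; reject instead of dereferencing NULL */
                fprintf (stderr, "volfile graph not initialized yet, "
                         "rejecting brick op\n");
                goto out;
        }

        any = active->first;
        (void) any;             /* ... unserialize the input dict, run the op ... */
        ret = 0;
out:
        return ret;
}

int
main (void)
{
        /* Simulate a GLUSTERD_BRICK_XLATOR_OP arriving before graph init:
         * the handler now returns -1 instead of segfaulting. */
        ctx_t ctx = { .active = NULL };

        printf ("ret = %d\n", handle_translator_op (&ctx));
        return 0;
}

Applied just before the dereference at glusterfsd-mgmt.c:590 shown in the
listing above, the same check would avoid the NULL dereference and let the
request fail cleanly instead of taking down the shd process.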