On Thu, Dec 31, 2015 at 03:40:54PM +0530, Raghavendra Talur wrote:

We have threads sleeping, either voluntarily (nanosleep) or not (lwp_park), and this:

c5223a80 (glusterfs) is in sleepq_block/cv_timedwait_sig/sbwait/soreceive/soo_read/do_filereadv/sys_readv

It is waiting in a read on a socket. Probably FUSE, but it would be nice to be certain.

c5346540 (glusterfs) is in sleepq_block/cv_timedwait_sig/sigtimedwait1/sys_____sigtimedwait50

This is an ordinary sigtimedwait(), but the timeout argument (third) is zero, which can let it sleep forever. Is that expected?

> cv_timedwait_sig(c53466b4,c5004b80,0,c53466a4,3,db727e90,c53466a4,c41eb528,db727eac,7ff0)

c5418020 (glusterfs) is in sleepq_block/sel_do_scan/pollcommon/sys_poll

This is an ordinary poll(2). The struct timespec for the timeout is at db721f18, and again this is an infinite timeout:

crash> x db721f18,2
db721f18: 0 0

(NB: 2 words because we run a 32-bit machine; struct timespec is a 32-bit time_t and a 32-bit long)

c53692c0 (perfused) is in sleepq_block/cv_timedwait_sig/kevent1/sys___kevent50

It is waiting for data (either from the kernel or from glusterfs, I do not know). Again we have an infinite timeout.

I note that the FUSE filesystem is responding. Since perfused is not multithreaded, this suggests it is not the stuck process. It may have missed a request or a reply, though, which would leave the calling process stuck.

Speaking of the calling process: I believe it is the quota utility?
Indeed, it is waiting for a reply from the filesystem:

UID   PID PPID  CPU PRI NI  VSZ  RSS WCHAN    STAT TTY       TIME COMMAND
  0 15221 1406 1546  85  0 3360 1080 puffsrpl I    pts/0- 0:00.06 tests/basic/quota /mnt/glusterfs/0/test_dir/1.txt 256 48

Here is its backtrace, obtained from gdb:

#0  0xbb69b6f7 in write () from /usr/lib/libc.so.12
#1  0x080489c0 in nwrite (fd=3, buf=0xbb501000, count=262144) at tests/basic/quota.c:16
#2  0x08048a8b in file_write (filename=0xbf7ffcb2 "/mnt/glusterfs/0/test_dir/1.txt", bs=262144, count=48) at tests/basic/quota.c:48
#3  0x08048b64 in main (argc=4, argv=0xbf7feba0) at tests/basic/quota.c:83

It is waiting for a write to complete, but we still do not know which process got the request and did not send the reply. Do you see any way to tell?

-- 
Emmanuel Dreyfus
manu@xxxxxxxxxx
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
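
As a side note on the infinite timeouts mentioned above, here is a minimal userland sketch of how poll(2) behaves with a zero versus an infinite timeout. It assumes only POSIX poll(2) and pipe(2) and is not taken from GlusterFS, perfused or the NetBSD kernel; the helper names wait_readable and demo_poll_timeouts are hypothetical, chosen for illustration. Note that the userland timeout is an int in milliseconds, not the struct timespec seen in the kernel trace.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from GlusterFS or NetBSD): poll a single
 * descriptor for readability. timeout_ms follows poll(2) semantics:
 * 0 returns at once, a negative value blocks until an event arrives.
 * Returns poll's result: 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    return poll(&pfd, 1, timeout_ms);
}

/*
 * Hypothetical demo on a pipe. Returns 0 when the observed behaviour
 * matches the expectations spelled out in the comments below, a small
 * positive step number otherwise, and -1 if the pipe cannot be created.
 */
int demo_poll_timeouts(void)
{
    int fds[2];
    int rc = 0;

    if (pipe(fds) != 0)
        return -1;

    /* Empty pipe, zero timeout: poll reports 0 ready fds without sleeping. */
    if (wait_readable(fds[0], 0) != 0)
        rc = 1;

    /* Make the read end readable. */
    if (write(fds[1], "x", 1) != 1)
        rc = 2;

    /*
     * Infinite timeout: returns only because data is pending. Without
     * the write above, this call would sleep until an event or a signal,
     * so it is skipped if an earlier step failed.
     */
    if (rc == 0 && wait_readable(fds[0], -1) != 1)
        rc = 3;

    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

A thread parked in the last call with nothing ever written to the descriptor would look much like the sys_poll thread in the trace: asleep with no deadline, which is normal for an idle event loop and does not by itself identify the stuck process.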