BUG: 764964 (dead lock)

chyd at ihep.ac.cn (chyd) · Thu, 10 May 2012 09:59:10 +0800

Hi all,

I keep hitting a lockup when reading or writing large numbers of files. I cannot interrupt the stuck process from gdb, ps locks up when it tries to read that process's data, and df (of course) locks up too. No kill signal has any effect. Apart from 'pidstat -p ALL', which can still report the pid, I can't do anything with the process. The only way out is umount -f.
I am using gluster 3.2.6 on CentOS 6.0 (2.6.32-71.el6.x86_64).
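
Since ps hangs but pidstat still reports the pid, a small script that reads only /proc/<pid>/stat may be enough to list the stuck tasks. The following is a minimal sketch, assuming (as the pidstat observation suggests) that /proc/<pid>/stat stays readable for the hung process; the function names parse_stat and uninterruptible_tasks are made up for illustration.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen].strip())
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    hung = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            hung.append((pid, comm))
    return hung


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

This deliberately avoids /proc/<pid>/cmdline and similar files, on the assumption that those are what make ps block.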

The problem looks the same as BUG 764964 (https://bugzilla.redhat.com/show_bug.cgi?id=764964). It is difficult to reproduce, and I am still trying to find a way to trigger it quickly. Has anyone else encountered this problem? How did you solve it?

Attached dmesg log:
May 10 00:01:52 PPC-002 kernel: INFO: task glusterfs:27888 blocked for more than 120 seconds.
May 10 00:01:52 PPC-002 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 10 00:01:52 PPC-002 kernel: glusterfs          D ffff88033fc24b00     0 27888      1 0x00000080
May 10 00:01:52 PPC-002 kernel: ffff8806310fbe48 0000000000000086 0000000000000000 ffff8806310fbc58
May 10 00:01:52 PPC-002 kernel: ffff8806310fbdc8 0000000000020010 ffff8806310fbee8 00000001021f184a
May 10 00:01:52 PPC-002 kernel: ffff8806311a0678 ffff8806310fbfd8 0000000000010518 ffff8806311a0678
May 10 00:01:52 PPC-002 kernel: Call Trace:
May 10 00:01:52 PPC-002 kernel: [<ffffffff814ca6b5>] rwsem_down_failed_common+0x95/0x1d0
May 10 00:01:52 PPC-002 kernel: [<ffffffff814ca813>] rwsem_down_write_failed+0x23/0x30
May 10 00:01:52 PPC-002 kernel: [<ffffffff81264253>] call_rwsem_down_write_failed+0x13/0x20
May 10 00:01:52 PPC-002 kernel: [<ffffffff814c9d12>] ? down_write+0x32/0x40
May 10 00:01:52 PPC-002 kernel: [<ffffffff8113b468>] sys_munmap+0x48/0x80
May 10 00:01:52 PPC-002 kernel: [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
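
To compare reports like the one above across machines, the hung-task messages can be pulled out of the syslog programmatically. This is only a sketch that assumes the log lines have exactly the shape shown in this mail; the regular expressions and the name parse_hung_tasks are illustrative, not part of any existing tool.

```python
import re

# Matches e.g. "INFO: task glusterfs:27888 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# Matches one call-trace frame; a leading "? " marks a frame the kernel
# unwinder was not sure about, and is dropped here.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>[\w.]+)\+0x")


def parse_hung_tasks(lines):
    """Group hung-task reports and the symbols of their call traces."""
    reports = []
    current = None
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            current = {
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "secs": int(m.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        f = FRAME_RE.search(line)
        if f and current is not None:
            current["frames"].append(f.group("sym"))
    return reports


if __name__ == "__main__":
    import sys
    for report in parse_hung_tasks(sys.stdin):
        print(report["comm"], report["pid"], " <- ".join(report["frames"]))
```

It could be fed the syslog on standard input; for the trace above it would report glusterfs with pid 27888 blocked under sys_munmap.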

Thank you in advance.
Yaodong

2012-05-10