GFS2 Problem

Hello,
 
We have a 24-node cluster with shared gfs2 partitions used for shared software (computational software, compilers, etc.) and users’ home directories.  Periodically we see the messages below in dmesg, and eventually users are unable to log in, which I assume is due to problems getting locks on the necessary files.  When the problem starts, commands like gfs2_tool df or ls (on the gfs2 directories only) hang.  After a reboot it’s fine for a while, but after a day or two we start seeing the same messages again.  I’ve been looking through the documentation and everything appears to be set up correctly, but it’s possible I missed something.  Does anyone have any suggestions on what might be wrong?  I’d rather get this working correctly than scrap it and use NFS as the shared storage.
 
All nodes run kernel version 2.6.18-194.11.3.el5.  If there’s any other information that would be useful, let me know and I will gladly provide it.
 
Dmesg output:
 
Oct  1 09:36:58 cluster kernel: INFO: task gfs2_quotad:5142 blocked for more than 120 seconds.
Oct  1 09:36:58 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:36:58 cluster kernel: gfs2_quotad   D ffffffff80150839     0  5142    135          5150  5141 (L-TLB)
Oct  1 09:36:58 cluster kernel:  ffff81010dcefcd0 0000000000000046 0000000000000018 ffffffff8863b4f3
Oct  1 09:36:58 cluster kernel:  0000000000000286 000000000000000a ffff81021fe02820 ffff8101239bb820
Oct  1 09:36:58 cluster kernel:  000042fb4fc836cb 0000000000008c76 ffff81021fe02a08 000000098863ce5a
Oct  1 09:36:58 cluster kernel: Call Trace:
Oct  1 09:36:58 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:36:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:36:58 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:36:58 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:36:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:36:58 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:36:58 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:36:58 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:36:58 cluster kernel:  [<ffffffff8867d5f2>] :gfs2:gfs2_statfs_sync+0x3f/0x165
Oct  1 09:36:58 cluster kernel:  [<ffffffff8867d5ea>] :gfs2:gfs2_statfs_sync+0x37/0x165
Oct  1 09:36:58 cluster kernel:  [<ffffffff8005b97e>] del_timer_sync+0xc/0x16
Oct  1 09:36:58 cluster kernel:  [<ffffffff886773e3>] :gfs2:quotad_check_timeo+0x20/0x60
Oct  1 09:36:58 cluster kernel:  [<ffffffff88678ee0>] :gfs2:gfs2_quotad+0xde/0x214
Oct  1 09:36:58 cluster kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Oct  1 09:36:58 cluster kernel:  [<ffffffff88678e02>] :gfs2:gfs2_quotad+0x0/0x214
Oct  1 09:36:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:36:58 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:36:58 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:36:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:36:58 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:36:58 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:36:58 cluster kernel:
Oct  1 09:38:58 cluster kernel: INFO: task gfs2_quotad:5142 blocked for more than 120 seconds.
Oct  1 09:38:58 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:38:58 cluster kernel: gfs2_quotad   D ffffffff80150839     0  5142    135          5150  5141 (L-TLB)
Oct  1 09:38:58 cluster kernel:  ffff81010dcefcd0 0000000000000046 0000000000000018 ffffffff8863b4f3
Oct  1 09:38:58 cluster kernel:  0000000000000286 000000000000000a ffff81021fe02820 ffff8101239bb820
Oct  1 09:38:58 cluster kernel:  000042fb4fc836cb 0000000000008c76 ffff81021fe02a08 000000098863ce5a
Oct  1 09:38:58 cluster kernel: Call Trace:
Oct  1 09:38:58 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:38:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:38:58 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:38:58 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:38:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:38:58 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:38:58 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:38:58 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:38:58 cluster kernel:  [<ffffffff8867d5f2>] :gfs2:gfs2_statfs_sync+0x3f/0x165
Oct  1 09:38:58 cluster kernel:  [<ffffffff8867d5ea>] :gfs2:gfs2_statfs_sync+0x37/0x165
Oct  1 09:38:58 cluster kernel:  [<ffffffff8005b97e>] del_timer_sync+0xc/0x16
Oct  1 09:38:58 cluster kernel:  [<ffffffff886773e3>] :gfs2:quotad_check_timeo+0x20/0x60
Oct  1 09:38:58 cluster kernel:  [<ffffffff88678ee0>] :gfs2:gfs2_quotad+0xde/0x214
Oct  1 09:38:58 cluster kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Oct  1 09:38:58 cluster kernel:  [<ffffffff88678e02>] :gfs2:gfs2_quotad+0x0/0x214
Oct  1 09:38:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:38:58 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:38:58 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:38:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:38:58 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:38:58 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:38:58 cluster kernel:
Oct  1 09:38:58 cluster kernel: INFO: task gfs2_quotad:5166 blocked for more than 120 seconds.
Oct  1 09:38:58 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:38:58 cluster kernel: gfs2_quotad   D ffffffff80150839     0  5166    135          5174  5165 (L-TLB)
Oct  1 09:38:59 cluster kernel:  ffff81021ad53cd0 0000000000000046 0000000000000018 ffffffff8863b4f3
Oct  1 09:38:59 cluster kernel:  0000000000000286 000000000000000a ffff81011be77080 ffff8101239bb820
Oct  1 09:38:59 cluster kernel:  0000431153553f46 0000000000008079 ffff81011be77268 000000098863ce5a
Oct  1 09:38:59 cluster kernel: Call Trace:
Oct  1 09:38:59 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:38:59 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:38:59 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:38:59 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:38:59 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:38:59 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:38:59 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:38:59 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:38:59 cluster kernel:  [<ffffffff8867d5f2>] :gfs2:gfs2_statfs_sync+0x3f/0x165
Oct  1 09:38:59 cluster kernel:  [<ffffffff8867d5ea>] :gfs2:gfs2_statfs_sync+0x37/0x165
Oct  1 09:38:59 cluster kernel:  [<ffffffff8005b97e>] del_timer_sync+0xc/0x16
Oct  1 09:38:59 cluster kernel:  [<ffffffff886773e3>] :gfs2:quotad_check_timeo+0x20/0x60
Oct  1 09:38:59 cluster kernel:  [<ffffffff88678ee0>] :gfs2:gfs2_quotad+0xde/0x214
Oct  1 09:38:59 cluster kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Oct  1 09:38:59 cluster kernel:  [<ffffffff88678e02>] :gfs2:gfs2_quotad+0x0/0x214
Oct  1 09:38:59 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:38:59 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:38:59 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:38:59 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:38:59 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:38:59 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:38:59 cluster kernel:
Oct  1 09:40:58 cluster kernel: INFO: task pdflush:534 blocked for more than 120 seconds.
Oct  1 09:40:58 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:40:58 cluster kernel: pdflush       D ffffffff80150839     0   534    135           535   533 (L-TLB)
Oct  1 09:40:58 cluster kernel:  ffff81011f107bd0 0000000000000046 00000000fffffff5 ffff81010eb9d800
Oct  1 09:40:58 cluster kernel:  0000000000000286 000000000000000a ffff81021f4357e0 ffff81021ff52860
Oct  1 09:40:58 cluster kernel:  0000432045835912 0000000000003241 ffff81021f4359c8 000000028863ce5a
Oct  1 09:40:58 cluster kernel: Call Trace:
Oct  1 09:40:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:40:58 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:40:58 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:40:58 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:40:58 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:40:58 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:40:58 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:40:58 cluster kernel:  [<ffffffff8867728d>] :gfs2:gfs2_write_inode+0x5f/0x152
Oct  1 09:40:58 cluster kernel:  [<ffffffff88677285>] :gfs2:gfs2_write_inode+0x57/0x152
Oct  1 09:40:58 cluster kernel:  [<ffffffff8002fc6e>] __writeback_single_inode+0x1e9/0x328
Oct  1 09:40:58 cluster kernel:  [<ffffffff80020ff4>] sync_sb_inodes+0x1b5/0x26f
Oct  1 09:40:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:40:58 cluster kernel:  [<ffffffff80050ffd>] writeback_inodes+0x82/0xd8
Oct  1 09:40:58 cluster kernel:  [<ffffffff800c9714>] wb_kupdate+0xd4/0x14e
Oct  1 09:40:58 cluster kernel:  [<ffffffff8005663c>] pdflush+0x0/0x1fb
Oct  1 09:40:58 cluster kernel:  [<ffffffff8005678d>] pdflush+0x151/0x1fb
Oct  1 09:40:58 cluster kernel:  [<ffffffff800c9640>] wb_kupdate+0x0/0x14e
Oct  1 09:40:58 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:40:58 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:40:58 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:40:58 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:40:58 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:40:58 cluster kernel:
Oct  1 09:40:58 cluster kernel: INFO: task gfs2_quotad:5142 blocked for more than 120 seconds.
Oct  1 09:40:59 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:40:59 cluster kernel: gfs2_quotad   D ffffffff80150839     0  5142    135          5150  5141 (L-TLB)
Oct  1 09:40:59 cluster kernel:  ffff81010dcefcd0 0000000000000046 0000000000000018 ffffffff8863b4f3
Oct  1 09:40:59 cluster kernel:  0000000000000286 000000000000000a ffff81021fe02820 ffff8101239bb820
Oct  1 09:40:59 cluster kernel:  000042fb4fc836cb 0000000000008c76 ffff81021fe02a08 000000098863ce5a
Oct  1 09:41:00 cluster kernel: Call Trace:
Oct  1 09:41:00 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:41:00 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:00 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:41:00 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:41:00 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:00 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:41:00 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:41:00 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:41:01 cluster kernel:  [<ffffffff8867d5f2>] :gfs2:gfs2_statfs_sync+0x3f/0x165
Oct  1 09:41:01 cluster kernel:  [<ffffffff8867d5ea>] :gfs2:gfs2_statfs_sync+0x37/0x165
Oct  1 09:41:01 cluster kernel:  [<ffffffff8005b97e>] del_timer_sync+0xc/0x16
Oct  1 09:41:01 cluster kernel:  [<ffffffff886773e3>] :gfs2:quotad_check_timeo+0x20/0x60
Oct  1 09:41:01 cluster kernel:  [<ffffffff88678ee0>] :gfs2:gfs2_quotad+0xde/0x214
Oct  1 09:41:01 cluster kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Oct  1 09:41:01 cluster kernel:  [<ffffffff88678e02>] :gfs2:gfs2_quotad+0x0/0x214
Oct  1 09:41:01 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:41:01 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:41:01 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:41:01 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:41:01 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:41:01 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:41:01 cluster kernel:
Oct  1 09:41:01 cluster kernel: INFO: task gfs2_quotad:5166 blocked for more than 120 seconds.
Oct  1 09:41:01 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:41:01 cluster kernel: gfs2_quotad   D ffffffff80150839     0  5166    135          5174  5165 (L-TLB)
Oct  1 09:41:01 cluster kernel:  ffff81021ad53cd0 0000000000000046 0000000000000018 ffffffff8863b4f3
Oct  1 09:41:01 cluster kernel:  0000000000000286 000000000000000a ffff81011be77080 ffff8101239bb820
Oct  1 09:41:01 cluster kernel:  0000431153553f46 0000000000008079 ffff81011be77268 000000098863ce5a
Oct  1 09:41:01 cluster kernel: Call Trace:
Oct  1 09:41:01 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:41:01 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:01 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:41:01 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:41:01 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:01 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:41:01 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:41:02 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:41:02 cluster kernel:  [<ffffffff8867d5f2>] :gfs2:gfs2_statfs_sync+0x3f/0x165
Oct  1 09:41:02 cluster kernel:  [<ffffffff8867d5ea>] :gfs2:gfs2_statfs_sync+0x37/0x165
Oct  1 09:41:02 cluster kernel:  [<ffffffff8005b97e>] del_timer_sync+0xc/0x16
Oct  1 09:41:02 cluster kernel:  [<ffffffff886773e3>] :gfs2:quotad_check_timeo+0x20/0x60
Oct  1 09:41:02 cluster kernel:  [<ffffffff88678ee0>] :gfs2:gfs2_quotad+0xde/0x214
Oct  1 09:41:02 cluster kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
Oct  1 09:41:02 cluster kernel:  [<ffffffff88678e02>] :gfs2:gfs2_quotad+0x0/0x214
Oct  1 09:41:02 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:41:02 cluster kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
Oct  1 09:41:02 cluster kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Oct  1 09:41:02 cluster kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
Oct  1 09:41:02 cluster kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
Oct  1 09:41:02 cluster kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Oct  1 09:41:02 cluster kernel:
Oct  1 09:41:02 cluster kernel: INFO: task bash:16809 blocked for more than 120 seconds.
Oct  1 09:41:02 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:41:02 cluster kernel: bash          D ffffffff80150839     0 16809  16808                     (NOTLB)
Oct  1 09:41:02 cluster kernel:  ffff810109aedcf8 0000000000000082 ffff81011e76dc48 ffff810109aedc68
Oct  1 09:41:02 cluster kernel:  ffff81011e76dc48 000000000000000a ffff81010a547860 ffff81021feda860
Oct  1 09:41:02 cluster kernel:  000043291ba0b196 000000000002e415 ffff81010a547a48 00000006000041a9
Oct  1 09:41:02 cluster kernel: Call Trace:
Oct  1 09:41:02 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:02 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:41:02 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:41:02 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:02 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:41:02 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:41:02 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:41:02 cluster kernel:  [<ffffffff8867526e>] :gfs2:gfs2_getattr+0x85/0xc4
Oct  1 09:41:03 cluster kernel:  [<ffffffff88675266>] :gfs2:gfs2_getattr+0x7d/0xc4
Oct  1 09:41:03 cluster kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Oct  1 09:41:03 cluster kernel:  [<ffffffff800288ec>] vfs_stat_fd+0x32/0x4a
Oct  1 09:41:03 cluster kernel:  [<ffffffff8000e35e>] current_fs_time+0x3b/0x40
Oct  1 09:41:03 cluster kernel:  [<ffffffff800236d3>] sys_newstat+0x19/0x31
Oct  1 09:41:03 cluster kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Oct  1 09:41:03 cluster kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Oct  1 09:41:03 cluster kernel:
Oct  1 09:41:03 cluster kernel: INFO: task vim:1253 blocked for more than 120 seconds.
Oct  1 09:41:03 cluster kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 09:41:03 cluster kernel: vim           D ffffffff80150839     0  1253   5909                     (NOTLB)
Oct  1 09:41:03 cluster kernel:  ffff810023fd9ae8 0000000000000086 0000000000000018 ffffffff8863b4f3
Oct  1 09:41:03 cluster kernel:  0000000000000296 0000000000000008 ffff8100cfb690c0 ffff8101239c7860
Oct  1 09:41:03 cluster kernel:  0000431a30340b0d 000000000000591c ffff8100cfb692a8 0000000a8863ce5a
Oct  1 09:41:03 cluster kernel: Call Trace:
Oct  1 09:41:03 cluster kernel:  [<ffffffff8863b4f3>] :dlm:request_lock+0x93/0xa0
Oct  1 09:41:03 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:03 cluster kernel:  [<ffffffff88666ef0>] :gfs2:just_schedule+0x9/0xe
Oct  1 09:41:03 cluster kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Oct  1 09:41:03 cluster kernel:  [<ffffffff88666ee7>] :gfs2:just_schedule+0x0/0xe
Oct  1 09:41:03 cluster kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Oct  1 09:41:03 cluster kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
Oct  1 09:41:03 cluster kernel:  [<ffffffff88666ee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Oct  1 09:41:03 cluster kernel:  [<ffffffff886686b2>] :gfs2:gfs2_glock_nq_num+0x43/0x68
Oct  1 09:41:03 cluster kernel:  [<ffffffff8866aa13>] :gfs2:gfs2_createi+0x559/0xd28
Oct  1 09:41:03 cluster kernel:  [<ffffffff88661be6>] :gfs2:gfs2_dirent_find+0x0/0x4e
Oct  1 09:41:04 cluster kernel:  [<ffffffff8866143f>] :gfs2:gfs2_dirent_scan+0xb1/0x175
Oct  1 09:41:04 cluster kernel:  [<ffffffff88661be6>] :gfs2:gfs2_dirent_find+0x0/0x4e
Oct  1 09:41:04 cluster kernel:  [<ffffffff8002d0ab>] wake_up_bit+0x11/0x22
Oct  1 09:41:04 cluster kernel:  [<ffffffff88675d7d>] :gfs2:gfs2_create+0x65/0x143
Oct  1 09:41:04 cluster kernel:  [<ffffffff8866a51d>] :gfs2:gfs2_createi+0x63/0xd28
Oct  1 09:41:04 cluster kernel:  [<ffffffff886686aa>] :gfs2:gfs2_glock_nq_num+0x3b/0x68
Oct  1 09:41:04 cluster kernel:  [<ffffffff8003a579>] vfs_create+0xe6/0x158
Oct  1 09:41:04 cluster kernel:  [<ffffffff8001b0d9>] open_namei+0x19d/0x6d5
Oct  1 09:41:04 cluster kernel:  [<ffffffff80027533>] do_filp_open+0x1c/0x38
Oct  1 09:41:04 cluster kernel:  [<ffffffff80019e5d>] do_sys_open+0x44/0xbe
Oct  1 09:41:04 cluster kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Oct  1 09:41:04 cluster kernel:
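
For anyone skimming the trace above, a minimal Python sketch that tallies the hung-task reports per task and PID may make the pattern easier to see. It is only an illustration: the function name summarize_hung_tasks and the sample list are hypothetical, and the regular expression assumes the "INFO: task NAME:PID blocked for more than N seconds." wording shown in this log.

```python
import re
from collections import Counter

# Matches the hung-task header lines as they appear in the dmesg output above.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>[^:\s]+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def summarize_hung_tasks(lines):
    """Count hung-task reports per (task name, pid). Hypothetical helper."""
    counts = Counter()
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            counts[(m.group("name"), int(m.group("pid")))] += 1
    return counts

# A few header lines copied from the log above, plus one non-matching line.
sample = [
    "Oct  1 09:36:58 cluster kernel: INFO: task gfs2_quotad:5142 blocked for more than 120 seconds.",
    "Oct  1 09:38:58 cluster kernel: INFO: task gfs2_quotad:5142 blocked for more than 120 seconds.",
    "Oct  1 09:40:58 cluster kernel: INFO: task pdflush:534 blocked for more than 120 seconds.",
    "Oct  1 09:40:58 cluster kernel: Call Trace:",
]

for (name, pid), n in sorted(summarize_hung_tasks(sample).items()):
    print(f"{name} (pid {pid}): {n} report(s)")
```

In practice the lines would come from the saved dmesg or /var/log/messages output rather than a hard-coded list.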
 
Last 100 lines of Glocks from debugfs:
 
I: n:48143/5366392 t:8 f:0x10 d:0x00000000 s:47/47
G:  s:SH n:2/70dae57 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:23069129/118337111 t:8 f:0x10 d:0x00000000 s:8643/8643
G:  s:SH n:2/3bc35c f: t:SH d:EX/0 l:0 a:0 r:3
I: n:274928/3916636 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:SH n:5/1c6d09b f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/3b5b88 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:274507/3890056 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:SH n:5/5922fe f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/39aa7d f: t:SH d:EX/0 l:0 a:0 r:3
I: n:274281/3779197 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:SH n:2/1cbe00e f: t:SH d:EX/0 l:0 a:0 r:3
I: n:202021/30138382 t:8 f:0x10 d:0x00000000 s:3382898/3382898
G:  s:SH n:2/3782e2 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:270652/3637986 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:UN n:3/2d58fff0 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/63703a f: t:SH d:EX/0 l:0 a:0 r:3
I: n:270317/6516794 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:SH n:5/82aa637 f: t:SH d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/62f0f7 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/23963a82 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/466b1fd f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/3bfec3 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/31f9e42a f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/3bd106 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/2119d61 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/1fe906a f: t:SH d:EX/0 l:0 a:0 r:3
I: n:204159/33460330 t:8 f:0x10 d:0x00000000 s:7989030/7989030
G:  s:SH n:2/62b900 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:269946/6469888 t:8 f:0x10 d:0x00000000 s:482/482
G:  s:SH n:5/618b67 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/2a2c12fe f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/45a3c6 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/3d6c9f78 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/1e5e0d5 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/636b56 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:2/82aa508 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/3beddb f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/3784d4 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:270761/3638484 t:8 f:0x10 d:0x00000000 s:16221/16221
G:  s:UN n:3/b70bc2 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/773a87 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/152c4791 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:15729162/355223441 t:8 f:0x10 d:0x00000000 s:21011/21011
G:  s:SH n:2/1c22460 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:200173/29500512 t:8 f:0x10 d:0x00000000 s:200665/200665
G:  s:SH n:5/37aa41 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/380ec6 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:273638/3673798 t:8 f:0x10 d:0x00000000 s:126106/126106
G:  s:UN n:3/546d1572 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/c50c9f8 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/1a967082 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/3fac9d f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/82ab6ac f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:25498 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/37fb9a f: t:SH d:EX/0 l:0 a:0 r:3
I: n:273471/3668890 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:UN n:3/406c8d78 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/1c52930 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/382499 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:273823/3679385 t:8 f:0x10 d:0x00000000 s:0/0
G:  s:SH n:5/1f460d6 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/2118f03 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:205534/34705155 t:8 f:0x10 d:0x00000000 s:8221/8221
G:  s:SH n:5/37ae06 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/20f3005 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:3/3ba7aa16 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:UN n:3/ce3c2b4 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/443189 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/2119a33 f: t:SH d:EX/0 l:0 a:0 r:3
I: n:205611/34708019 t:8 f:0x10 d:0x00000000 s:17664/17664
G:  s:SH n:5/217bcb1 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/61d6cd f: t:SH d:EX/0 l:0 a:0 r:3
I: n:269030/6411981 t:8 f:0x10 d:0x00000000 s:658/658
G:  s:SH n:5/82aac9f f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:24366 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/3a6360 f: t:SH d:EX/0 l:0 a:0 r:3
H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
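
A rough Python sketch for summarizing a glock dump like the one above, counting "G:" lines by glock type and state. The helper name summarize_glocks is hypothetical, the regular expression assumes the "G:  s:STATE n:TYPE/NUMBER" layout shown here, and the type labels in the comment (2 = inode, 3 = resource group, 5 = iopen) are my reading of the GFS2 glock numbering, so treat them as an assumption.

```python
import re
from collections import Counter

# Assumed layout of a glock line: "G:  s:<state> n:<type>/<hex number> ..."
GLOCK_RE = re.compile(r"^G:\s+s:(?P<state>\S+)\s+n:(?P<type>\d+)/(?P<num>[0-9a-f]+)")

def summarize_glocks(lines):
    """Count glocks per (type, state). Hypothetical helper.

    Assumed type numbers: 2 = inode, 3 = resource group, 5 = iopen.
    """
    counts = Counter()
    for line in lines:
        m = GLOCK_RE.match(line)
        if m:
            counts[(int(m.group("type")), m.group("state"))] += 1
    return counts

# Lines copied from the dump above; I: and H: lines are ignored.
sample = [
    "G:  s:SH n:2/70dae57 f: t:SH d:EX/0 l:0 a:0 r:3",
    "I: n:48143/5366392 t:8 f:0x10 d:0x00000000 s:47/47",
    "G:  s:SH n:5/1c6d09b f: t:SH d:EX/0 l:0 a:0 r:3",
    "H: s:SH f:EH e:0 p:12552 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]",
    "G:  s:UN n:3/2d58fff0 f: t:UN d:EX/0 l:0 a:0 r:2",
    "G:  s:SH n:2/3bc35c f: t:SH d:EX/0 l:0 a:0 r:3",
]

for (gtype, state), n in sorted(summarize_glocks(sample).items()):
    print(f"type {gtype} state {state}: {n}")
```

Run against the full debugfs glocks file, a tally like this gives a quick view of how many glocks of each type are held in each state on a node.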
 
Thanks.
 
--
Jason Giangrande
System Administrator
Clark University