As a follow-up, I see these kernel messages in my logs on both nodes while I'm testing with bonnie++:

Feb 29 12:51:30 node1 kernel: INFO: task bonnie++:2536 blocked for more than 120 seconds.
Feb 29 12:51:30 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 29 12:51:30 node1 kernel: bonnie++ D 0000000000000000 0 2536 1445 0x00000080
Feb 29 12:51:30 node1 kernel: ffff8801af87d910 0000000000000082 0000000000000000 0000000000000002
Feb 29 12:51:30 node1 kernel: 0000000000000246 ffff8801af87d8b8 ffff8801af87d918 ffff8800280399b8
Feb 29 12:51:30 node1 kernel: ffff880234cbc6b8 ffff8801af87dfd8 000000000000f4e8 ffff880234cbc6b8
Feb 29 12:51:30 node1 kernel: Call Trace:
Feb 29 12:51:30 node1 kernel: [<ffffffff814edb2e>] ? __wait_on_bit+0x7e/0x90
Feb 29 12:51:30 node1 kernel: [<ffffffff814eef25>] rwsem_down_failed_common+0x95/0x1d0
Feb 29 12:51:30 node1 kernel: [<ffffffff814ef0b6>] rwsem_down_read_failed+0x26/0x30
Feb 29 12:51:30 node1 kernel: [<ffffffffa03deb8b>] ? do_promote+0x14b/0x340 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff81276d54>] call_rwsem_down_read_failed+0x14/0x30
Feb 29 12:51:30 node1 kernel: [<ffffffff814ee5b4>] ? down_read+0x24/0x30
Feb 29 12:51:30 node1 kernel: [<ffffffffa03e3ecc>] gfs2_log_reserve+0xfc/0x190 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03fb5e2>] gfs2_trans_begin+0x112/0x1d0 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03e74db>] gfs2_write_begin+0x18b/0x4e0 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff811112ae>] generic_file_buffered_write+0x10e/0x2a0
Feb 29 12:51:30 node1 kernel: [<ffffffff81070637>] ? current_fs_time+0x27/0x30
Feb 29 12:51:30 node1 kernel: [<ffffffff81190171>] ? file_update_time+0x1/0x170
Feb 29 12:51:30 node1 kernel: [<ffffffff81112c00>] __generic_file_aio_write+0x250/0x480
Feb 29 12:51:30 node1 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff81112e9f>] generic_file_aio_write+0x6f/0xe0
Feb 29 12:51:30 node1 kernel: [<ffffffffa03e981e>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff81176080>] ? do_sync_write+0x0/0x140
Feb 29 12:51:30 node1 kernel: [<ffffffff8117617a>] do_sync_write+0xfa/0x140
Feb 29 12:51:30 node1 kernel: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
Feb 29 12:51:30 node1 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff8120c0d6>] ? security_file_permission+0x16/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff81176478>] vfs_write+0xb8/0x1a0
Feb 29 12:51:30 node1 kernel: [<ffffffff810d4582>] ? audit_syscall_entry+0x272/0x2a0
Feb 29 12:51:30 node1 kernel: [<ffffffff81176e81>] sys_write+0x51/0x90
Feb 29 12:51:30 node1 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Feb 29 12:51:30 node1 kernel: INFO: task flush-253:0:2537 blocked for more than 120 seconds.
Feb 29 12:51:30 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 29 12:51:30 node1 kernel: flush-253:0 D 0000000000000000 0 2537 2 0x00000080
Feb 29 12:51:30 node1 kernel: ffff88023494b9a0 0000000000000046 0000000000000000 ffff88015615b2a0
Feb 29 12:51:30 node1 kernel: ffff88023494b950 ffffea00022c28b8 ffff880082ff9748 ffff880082ff9748
Feb 29 12:51:30 node1 kernel: ffff88023707f038 ffff88023494bfd8 000000000000f4e8 ffff88023707f038
Feb 29 12:51:30 node1 kernel: Call Trace:
Feb 29 12:51:30 node1 kernel: [<ffffffffa03dc5a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03dc5ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff814edb0f>] __wait_on_bit+0x5f/0x90
Feb 29 12:51:30 node1 kernel: [<ffffffffa03e66a0>] ? gfs2_get_block_noalloc+0x0/0x40 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03dc5a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff814edbb8>] out_of_line_wait_on_bit+0x78/0x90
Feb 29 12:51:30 node1 kernel: [<ffffffff81090ad0>] ? wake_bit_function+0x0/0x50
Feb 29 12:51:30 node1 kernel: [<ffffffffa03dd195>] gfs2_glock_wait+0x45/0x90 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03df820>] gfs2_glock_nq+0x1d0/0x360 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03f7b4e>] gfs2_glock_nq_init+0x1e/0x40 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03f87de>] gfs2_write_inode+0x28e/0x2f0 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffffa03f7b46>] ? gfs2_glock_nq_init+0x16/0x40 [gfs2]
Feb 29 12:51:30 node1 kernel: [<ffffffff811a0434>] writeback_single_inode+0x204/0x2c0
Feb 29 12:51:30 node1 kernel: [<ffffffff811a074e>] writeback_sb_inodes+0xce/0x180
Feb 29 12:51:30 node1 kernel: [<ffffffff811a08ab>] writeback_inodes_wb+0xab/0x1b0
Feb 29 12:51:30 node1 kernel: [<ffffffff811a0c4b>] wb_writeback+0x29b/0x3f0
Feb 29 12:51:30 node1 kernel: [<ffffffff814ec9ce>] ? thread_return+0x4e/0x760
Feb 29 12:51:30 node1 kernel: [<ffffffff811a0e5b>] wb_do_writeback+0xbb/0x240
Feb 29 12:51:30 node1 kernel: [<ffffffff811a1043>] bdi_writeback_task+0x63/0x1b0
Feb 29 12:51:30 node1 kernel: [<ffffffff81090957>] ? bit_waitqueue+0x17/0xd0
Feb 29 12:51:30 node1 kernel: [<ffffffff81134be0>] ? bdi_start_fn+0x0/0x100
Feb 29 12:51:30 node1 kernel: [<ffffffff81134c66>] bdi_start_fn+0x86/0x100
Feb 29 12:51:30 node1 kernel: [<ffffffff81134be0>] ? bdi_start_fn+0x0/0x100
Feb 29 12:51:30 node1 kernel: [<ffffffff81090726>] kthread+0x96/0xa0
Feb 29 12:51:30 node1 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
Feb 29 12:51:30 node1 kernel: [<ffffffff81090690>] ? kthread+0x0/0xa0
Feb 29 12:51:30 node1 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster