On Sep 21, 2012, at 10:34 AM, Steven Whitehouse wrote:

> Hi,
>
> On Thu, 2012-09-20 at 16:25 +0200, Andrew Holway wrote:
>> It seems that my node004 is the problem.
>>
>> I cannot kill the iozone processes and I find this in the logs.
>>
> This looks like there is some problem with the i/o stack below the level
> of GFS2. What kind of storage are you using? If this is a JBOD then
> perhaps there is a faulty disk or something like that?

Why do you say that?

It did it again, but I have no indication from my storage brick that I have an issue.

It does appear that it was the same node (node004) that caused the issue again. The other three stopped doing IO for some time and then resumed. node004 died completely.

I will run with loglevel=TRACE now.

Thanks,

Andrew

node004 messages

Sep 21 11:28:50 node004 dlm_controld[22407]: dlm_controld 3.0.12.1 started
Sep 21 11:28:51 node004 gfs_controld[22456]: gfs_controld 3.0.12.1 started
Sep 21 11:28:59 node004 kernel: dlm: Using TCP for communications
Sep 21 11:29:00 node004 clvmd: Cluster LVM daemon started - connected to CMAN
Sep 21 11:29:00 node004 kernel: dlm: connecting to 2
Sep 21 11:29:00 node004 kernel: dlm: connecting to 1
Sep 21 11:29:01 node004 kernel: dlm: connecting to 3
Sep 21 11:30:06 node004 kernel: GFS2 (built Jun 22 2012 12:21:46) installed
Sep 21 11:30:06 node004 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "nimble_cluster:gfs_test"
Sep 21 11:30:06 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: Joined cluster. Now mounting FS...
Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2, already locked for use
Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2: Looking at journal...
Sep 21 11:30:07 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: jid=2: Done
Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 11:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 11:38:24 node004 xinetd[5015]: START: node-state pid=23019 from=::ffff:10.141.255.254
Sep 21 11:38:24 node004 xinetd[5015]: EXIT: node-state status=0 pid=23019 duration=0(sec)
Sep 21 11:39:23 node004 xinetd[5015]: START: node-state pid=23038 from=::ffff:10.141.255.254
Sep 21 11:39:23 node004 xinetd[5015]: EXIT: node-state status=0 pid=23038 duration=0(sec)
Sep 21 11:39:25 node004 xinetd[5015]: START: node-state pid=23057 from=::ffff:10.141.255.254
Sep 21 11:39:25 node004 xinetd[5015]: EXIT: node-state status=0 pid=23057 duration=0(sec)
Sep 21 11:39:40 node004 xinetd[5015]: START: node-state pid=23075 from=::ffff:10.141.255.254
Sep 21 11:39:40 node004 xinetd[5015]: EXIT: node-state status=0 pid=23075 duration=0(sec)
Sep 21 11:39:45 node004 xinetd[5015]: START: node-state pid=23097 from=::ffff:10.141.255.254
Sep 21 11:39:45 node004 xinetd[5015]: EXIT: node-state status=0 pid=23097 duration=0(sec)
Sep 21 11:39:53 node004 xinetd[5015]: START: node-state pid=23119 from=::ffff:10.141.255.254
Sep 21 11:39:53 node004 xinetd[5015]: EXIT: node-state status=0 pid=23119 duration=0(sec)
Sep 21 11:39:54 node004 rpc.statd[23170]: Version 1.2.3 starting
Sep 21 11:39:54 node004 sm-notify[23171]: Version 1.2.3 starting
Sep 21 11:40:50 node004 rpc.statd[23215]: Version 1.2.3 starting
Sep 21 11:40:50 node004 sm-notify[23216]: Version 1.2.3 starting
Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23804 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23804 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880dd06db958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880dd06db928 000000004280d602 0000000000000000 ffff880ff308f380
Sep 21 12:54:43 node004 kernel: ffff88100c25b058 ffff880dd06dbfd8 000000000000fb88 ffff88100c25b058
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff81179b51>] ? generic_file_llseek_unlocked+0x1/0x80
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117c4b9>] ? fget_light+0x19/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23805 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23805 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880dd073b958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880dd073b928 000000002c279ff5 0000000000000000 ffff8810048e10c0
Sep 21 12:54:43 node004 kernel: ffff881015abdaf8 ffff880dd073bfd8 000000000000fb88 ffff881015abdaf8
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff811937f0>] ? dput+0x0/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23806 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000008 0 23806 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880dd070d958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880dd070d928 00000000780f0b6e 0000000000000000 ffff882006c04a40
Sep 21 12:54:43 node004 kernel: ffff88100d04d058 ffff880dd070dfd8 000000000000fb88 ffff88100d04d058
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffff81116681>] ? generic_file_aio_write+0x1/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23807 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23807 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880dd06ab958 0000000000000082 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880dd06ab928 00000000b2eadd67 0000000000000000 ffff881004891ec0
Sep 21 12:54:43 node004 kernel: ffff881015ff8638 ffff880dd06abfd8 000000000000fb88 ffff881015ff8638
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff810623da>] ? __cond_resched+0x2a/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff811bb89f>] ? inotify_inode_queue_event+0x2f/0x120
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23808 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23808 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff88100eb19958 0000000000000082 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff88100eb19928 00000000291dd6ee 0000000000000000 ffff881004891a40
Sep 21 12:54:43 node004 kernel: ffff88100bb3b098 ffff88100eb19fd8 000000000000fb88 ffff88100bb3b098
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff814fea54>] ? mutex_unlock+0x14/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23809 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23809 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880fe76d3958 0000000000000082 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880fe76d3928 0000000075c0b5d9 0000000000000000 ffff880eb83d5bc0
Sep 21 12:54:43 node004 kernel: ffff880ff10f65f8 ffff880fe76d3fd8 000000000000fb88 ffff880ff10f65f8
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23810 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23810 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880e4e045958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880e4e045928 00000000ab0cddda 0000000000000000 ffff88100d2fd6c0
Sep 21 12:54:43 node004 kernel: ffff8810079ba5f8 ffff880e4e045fd8 000000000000fb88 ffff8810079ba5f8
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811aa034>] ? generic_write_sync+0x24/0x50
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdee>] ? reschedule_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23811 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000008 0 23811 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff88100c499958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff88100c499928 000000004b5eff48 0000000000000000 ffff88201678ac80
Sep 21 12:54:43 node004 kernel: ffff88100db61ab8 ffff88100c499fd8 000000000000fb88 ffff88100db61ab8
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811aa01d>] ? generic_write_sync+0xd/0x50
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23813 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23813 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880dd073f958 0000000000000086 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880dd073f928 00000000f66794ee 0000000000000000 ffff881007a7ba80
Sep 21 12:54:43 node004 kernel: ffff880fe78c7af8 ffff880dd073ffd8 000000000000fb88 ffff880fe78c7af8
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff814fea41>] ? mutex_unlock+0x1/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff811937f0>] ? dput+0x0/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:54:43 node004 kernel: INFO: task iozone:23814 blocked for more than 120 seconds.
Sep 21 12:54:43 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 12:54:43 node004 kernel: iozone D 0000000000000011 0 23814 22911 0x00000080
Sep 21 12:54:43 node004 kernel: ffff880e4e119958 0000000000000082 0000000000000000 ffffffffa01bf1fc
Sep 21 12:54:43 node004 kernel: ffff880e4e119928 00000000fa09a3ba 0000000000000000 ffff88100d2fd300
Sep 21 12:54:43 node004 kernel: ffff880fe78c7098 ffff880e4e119fd8 000000000000fb88 ffff880fe78c7098
Sep 21 12:54:43 node004 kernel: Call Trace:
Sep 21 12:54:43 node004 kernel: [<ffffffffa01bf1fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Sep 21 12:54:43 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0
Sep 21 12:54:43 node004 kernel: [<ffffffff811b663e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Sep 21 12:54:43 node004 kernel: [<ffffffffa098e570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0990ca8>] ? do_promote+0x208/0x330 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff811b6b2e>] __blockdev_direct_IO+0x5e/0xd0
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998990>] gfs2_direct_IO+0x100/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa0998760>] ? gfs2_get_block_direct+0x0/0x20 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffffa09988ec>] ? gfs2_direct_IO+0x5c/0x110 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff81114d32>] generic_file_direct_write+0xc2/0x190
Sep 21 12:54:43 node004 kernel: [<ffffffff81116545>] __generic_file_aio_write+0x345/0x480
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff811aa034>] ? generic_write_sync+0x24/0x50
Sep 21 12:54:43 node004 kernel: [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
Sep 21 12:54:43 node004 kernel: [<ffffffffa099b8be>] gfs2_file_aio_write+0x7e/0xb0 [gfs2]
Sep 21 12:54:43 node004 kernel: [<ffffffff8100ba4e>] ? common_interrupt+0xe/0x13
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ac70>] ? do_sync_write+0x0/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
Sep 21 12:54:43 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Sep 21 12:54:43 node004 kernel: [<ffffffff8121fd8b>] ? selinux_file_permission+0xfb/0x150
Sep 21 12:54:43 node004 kernel: [<ffffffff81213136>] ? security_file_permission+0x16/0x20
Sep 21 12:54:43 node004 kernel: [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
Sep 21 12:54:43 node004 kernel: [<ffffffff8117ba81>] sys_write+0x51/0x90
Sep 21 12:54:43 node004 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Sep 21 12:55:38 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 08 69 0a 88 00 00 20 00
Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Sep 21 12:55:39 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 03 ce 6a c0 00 00 20 00
Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Sep 21 12:55:41 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 81 f2 28 00 00 20 00
Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Sep 21 12:55:46 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 e6 0b 58 00 00 20 00
Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: timing out command, waited 180s
Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code
Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Sep 21 12:55:47 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 48 cb 48 00 00 20 00
Sep 21 12:55:49 node004 kernel: sd 6:0:0:0:
timing out command, waited 180s Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:49 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 07 64 ad 88 00 00 20 00 Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:50 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 05 79 27 80 00 00 20 00 Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:52 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 6c 26 60 00 00 20 00 Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:54 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 04 de 20 c8 00 00 20 00 Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:55 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 03 2a 6c 20 00 00 20 00 Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:57 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 92 25 28 00 00 20 00 Sep 21 12:55:58 
node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:55:58 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 06 03 33 90 00 00 20 00 Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:56:00 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 23 7e 40 00 00 20 00 Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:56:02 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 07 73 c9 18 00 00 20 00 Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:56:03 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 01 4f 6a a8 00 00 20 00 Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:56:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 04 3e 9c c8 00 00 20 00 Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 12:59:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 00 09 ea c8 
00 00 28 00 Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80985 Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80986 Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80987 Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80988 Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 12:59:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80989 Sep 21 12:59:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 13:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 13:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: timing out command, waited 180s Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] Unhandled error code Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_OK Sep 21 13:02:05 node004 kernel: sd 6:0:0:0: [sdj] CDB: Write(10): 2a 00 00 09 ea f0 00 00 08 00 Sep 21 13:02:05 node004 kernel: Buffer I/O error on device dm-0, logical block 80990 Sep 21 13:02:05 node004 kernel: lost page write due to I/O error on dm-0 Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: fatal: I/O error Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: block = 80990 Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: function = 
log_write_header, file = fs/gfs2/log.c, line = 616 Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: about to withdraw this file system Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: telling LM to unmount Sep 21 13:02:05 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.2: withdrawn Sep 21 13:02:05 node004 kernel: Pid: 22758, comm: glock_workqueue Not tainted 2.6.32-279.el6.x86_64 #1 Sep 21 13:02:05 node004 kernel: Call Trace: Sep 21 13:02:05 node004 kernel: [<ffffffffa09ad062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffff814fea28>] ? out_of_line_wait_on_bit+0x78/0x90 Sep 21 13:02:05 node004 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 Sep 21 13:02:05 node004 kernel: [<ffffffffa09ad0d0>] ? gfs2_io_error_bh_i+0x40/0x50 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffff811adfb6>] ? __wait_on_buffer+0x26/0x30 Sep 21 13:02:05 node004 kernel: [<ffffffffa0995288>] ? log_write_header+0x3a8/0x490 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffffa0995951>] ? gfs2_log_flush+0x301/0x6f0 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0 Sep 21 13:02:05 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 Sep 21 13:02:05 node004 kernel: [<ffffffffa09927d0>] ? inode_go_sync+0x80/0x160 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffffa0991336>] ? do_xmote+0x156/0x280 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e Sep 21 13:02:05 node004 kernel: [<ffffffffa0991551>] ? run_queue+0xf1/0x1d0 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffffa0991d2a>] ? glock_work_func+0x7a/0x1b0 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffffa0991cb0>] ? glock_work_func+0x0/0x1b0 [gfs2] Sep 21 13:02:05 node004 kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0 Sep 21 13:02:05 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 Sep 21 13:02:05 node004 kernel: [<ffffffff8108c5f0>] ? 
worker_thread+0x0/0x2a0 Sep 21 13:02:05 node004 kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0 Sep 21 13:02:05 node004 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20 Sep 21 13:02:05 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0 Sep 21 13:02:05 node004 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20 Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 13:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:00:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, 
cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 14:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 15:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 
00 00]
Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00]
Sep 21 15:30:10 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00]
>
> Steve.
>
>> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
>> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
>> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> Sep 20 15:59:09 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 01 94 88 a0 00 00 20 00
>> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
>> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
>> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> Sep 20 15:59:11 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 51 ff 90 00 00 20 00
>> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
>> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
>> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> Sep 20 15:59:13 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 46 d5 c0 00 00 20 00
>> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: timing out command, waited 180s
>> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code
>> Sep 20 15:59:14 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
>> Sep 20 15:59:14 node004
kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 03 da c7 78 00 00 20 00 >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 15:59:16 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 06 f5 8f 60 00 00 20 00 >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 15:59:17 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 01 30 7c 90 00 00 20 00 >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 15:59:19 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 05 79 8b e0 00 00 20 00 >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 15:59:20 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 04 37 13 08 00 00 20 00 >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check 
condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown 
type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[4d 00 40 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:00:11 node004 kernel: hpsa 0000:02:00.0: cp ffff88007f900000 has check condition: unknown type: Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2, cmd=[37 00 0c 00 00 00 00 00 04 00 00 00 00 00 00 00] >> Sep 20 16:02:15 node004 kernel: INFO: task glock_workqueue:9820 blocked for more than 120 seconds. >> Sep 20 16:02:15 node004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> Sep 20 16:02:15 node004 kernel: glock_workque D 000000000000001b 0 9820 2 0x00000080 >> Sep 20 16:02:15 node004 kernel: ffff8820150a9c70 0000000000000046 0000000000000004 00000000aa8f20cf >> Sep 20 16:02:15 node004 kernel: ffff881fffd050c8 0000000000000441 ffff8820150a9c10 ffffffff811acd5e >> Sep 20 16:02:15 node004 kernel: ffff882015b39ab8 ffff8820150a9fd8 000000000000fb88 ffff882015b39ab8 >> Sep 20 16:02:15 node004 kernel: Call Trace: >> Sep 20 16:02:15 node004 kernel: [<ffffffff811acd5e>] ? submit_bh+0x10e/0x150 >> Sep 20 16:02:15 node004 kernel: [<ffffffff8109cd39>] ? 
ktime_get_ts+0xa9/0xe0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff814fdfc3>] io_schedule+0x73/0xc0 >> Sep 20 16:02:15 node004 kernel: [<ffffffffa094aaca>] gfs2_log_flush+0x47a/0x6f0 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 >> Sep 20 16:02:15 node004 kernel: [<ffffffffa09477d0>] inode_go_sync+0x80/0x160 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946336>] do_xmote+0x156/0x280 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946551>] run_queue+0xf1/0x1d0 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946d2a>] glock_work_func+0x7a/0x1b0 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffffa0946cb0>] ? glock_work_func+0x0/0x1b0 [gfs2] >> Sep 20 16:02:15 node004 kernel: [<ffffffff8108c760>] worker_thread+0x170/0x2a0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 >> Sep 20 16:02:15 node004 kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff81091d66>] kthread+0x96/0xa0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20 >> Sep 20 16:02:15 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0 >> Sep 20 16:02:15 node004 kernel: [<ffffffff8100c140>] ? 
child_rip+0x0/0x20 >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 16:02:21 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e c0 00 00 20 00 >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117976 >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117977 >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117978 >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:02:21 node004 kernel: Buffer I/O error on device dm-6, logical block 117979 >> Sep 20 16:02:21 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 16:02:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e b8 00 00 08 00 >> Sep 20 16:02:22 node004 kernel: Buffer I/O error on device dm-6, logical block 117975 >> Sep 20 16:02:22 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 16:05:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 0e 6e e0 00 00 08 00 >> Sep 20 16:05:22 node004 kernel: Buffer I/O error on device dm-6, logical block 117980 >> Sep 20 
16:05:22 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: fatal: I/O error >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: block = 117980 >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: function = log_write_header, file = fs/gfs2/log.c, line = 616 >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: about to withdraw this file system >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: telling LM to unmount >> Sep 20 16:05:22 node004 kernel: GFS2: fsid=nimble_cluster:gfs_test.3: withdrawn >> Sep 20 16:05:22 node004 kernel: Pid: 9820, comm: glock_workqueue Not tainted 2.6.32-279.el6.x86_64 #1 >> Sep 20 16:05:22 node004 kernel: Call Trace: >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0962062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffff814fea28>] ? out_of_line_wait_on_bit+0x78/0x90 >> Sep 20 16:05:22 node004 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >> Sep 20 16:05:22 node004 kernel: [<ffffffffa09620d0>] ? gfs2_io_error_bh_i+0x40/0x50 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffff811adfb6>] ? __wait_on_buffer+0x26/0x30 >> Sep 20 16:05:22 node004 kernel: [<ffffffffa094a288>] ? log_write_header+0x3a8/0x490 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffffa094a951>] ? gfs2_log_flush+0x301/0x6f0 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffff810629d3>] ? dequeue_entity+0x113/0x2e0 >> Sep 20 16:05:22 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 >> Sep 20 16:05:22 node004 kernel: [<ffffffffa09477d0>] ? inode_go_sync+0x80/0x160 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946336>] ? do_xmote+0x156/0x280 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffff814fd830>] ? thread_return+0x4e/0x76e >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946551>] ? 
run_queue+0xf1/0x1d0 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946d2a>] ? glock_work_func+0x7a/0x1b0 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffffa0946cb0>] ? glock_work_func+0x0/0x1b0 [gfs2] >> Sep 20 16:05:22 node004 kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0 >> Sep 20 16:05:22 node004 kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 >> Sep 20 16:05:22 node004 kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0 >> Sep 20 16:05:22 node004 kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0 >> Sep 20 16:05:22 node004 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20 >> Sep 20 16:05:22 node004 kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0 >> Sep 20 16:05:22 node004 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20 >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: timing out command, waited 180s >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] Unhandled error code >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] Result: hostbyte=DID_OK driverbyte=DRIVER_OK >> Sep 20 16:08:22 node004 kernel: sd 3:0:0:0: [sdi] CDB: Write(10): 2a 00 00 00 08 b0 00 00 08 00 >> Sep 20 16:08:22 node004 kernel: Buffer I/O error on device dm-6, logical block 22 >> Sep 20 16:08:22 node004 kernel: lost page write due to I/O error on dm-6 >> Sep 20 16:16:05 node004 xinetd[4416]: START: node-state pid=14578 from=::ffff:10.141.255.254 >> Sep 20 16:16:05 node004 xinetd[4416]: EXIT: node-state status=0 pid=14578 duration=0(sec) >> Sep 20 16:17:34 node004 xinetd[4416]: START: node-state pid=14653 from=::ffff:10.141.255.254 >> Sep 20 16:17:34 node004 xinetd[4416]: EXIT: node-state status=0 pid=14653 duration=0(sec) >> Sep 20 16:17:36 node004 xinetd[4416]: START: node-state pid=14671 from=::ffff:10.141.255.254 >> Sep 20 16:17:36 node004 xinetd[4416]: EXIT: node-state status=0 pid=14671 duration=0(sec) >> Sep 20 16:17:39 node004 xinetd[4416]: START: node-state pid=14690 from=::ffff:10.141.255.254 >> Sep 20 16:17:39 node004 xinetd[4416]: EXIT: 
node-state status=0 pid=14690 duration=0(sec)
>> Sep 20 16:17:41 node004 xinetd[4416]: START: node-state pid=14708 from=::ffff:10.141.255.254
>> Sep 20 16:17:41 node004 xinetd[4416]: EXIT: node-state status=0 pid=14708 duration=0(sec)
>> On Sep 20, 2012, at 4:14 PM, Andrew Holway wrote:
>>
>>> Also,
>>>
>>> IOzone gave this error: Error writing block 29813, fd= 3
>>>
>>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Trying to acquire journal lock...
>>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Looking at journal...
>>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Acquiring the transaction lock...
>>> GFS2: fsid=nimble_cluster:gfs_test.0: jid=3: Replaying journal...
>>>
>>>
>>> GFS seemed to repair itself and things carried on working.
>>>
>>> Thanks,
>>>
>>> Andrew
>>>
>>> On Sep 20, 2012, at 4:08 PM, Andrew Holway wrote:
>>>
>>>> Hello,
>>>>
>>>> I have set up a 4-node cluster. The nodes are interconnected with IPoIB (connected mode).
>>>>
>>>> Whilst running a benchmark with IOzone I got the following errors:
>>>>
>>>> IO seems to have halted.
>>>>
>>>> Thanks,
>>>>
>>>> Andrew
>>>>
>>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15816 blocked for more than 120 seconds.
>>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15816 15374 0x00000080
>>>> Sep 20 16:01:57 node001 kernel: ffff880fd5ebbac8 0000000000000086 ffff880fd5ebba38 ffffffff81276b66
>>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fd5ebba58 ffffffff81091f97
>>>> Sep 20 16:01:57 node001 kernel: ffff880fe238c638 ffff880fd5ebbfd8 000000000000fb88 ffff880fe238c638
>>>> Sep 20 16:01:57 node001 kernel: Call Trace:
>>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60
>>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
>>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ?
gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15818 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15818 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe5b3ac8 0000000000000082 0000000000000000 ffff881ff95587a0 >>>> Sep 20 16:01:57 node001 kernel: ffff881000000002 ffff88100ee13048 00000000be5b3a58 00000040ffffffff >>>> Sep 20 16:01:57 node001 kernel: ffff88100eed1ab8 ffff880fbe5b3fd8 000000000000fb88 ffff88100eed1ab8 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? 
gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15820 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000008 0 15820 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880ffed7bac8 0000000000000086 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000fed7ba58 00000040ffffffff >>>> Sep 20 16:01:57 node001 kernel: ffff880fbdd51ab8 ffff880ffed7bfd8 000000000000fb88 ffff880fbdd51ab8 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15822 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15822 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880fd5dddac8 0000000000000086 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fd5ddda58 ffffffff81091f97 >>>> Sep 20 16:01:57 node001 kernel: ffff880fe238d098 ffff880fd5dddfd8 000000000000fb88 ffff880fe238d098 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? 
gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15824 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15824 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe5edac8 0000000000000086 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000be5eda58 00000040ffffffff >>>> Sep 20 16:01:57 node001 kernel: ffff880ff69085f8 ffff880fbe5edfd8 000000000000fb88 ffff880ff69085f8 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15826 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15826 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880fbe7cfac8 0000000000000086 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000096 ffff881ff95588d8 ffff880fbe7cfa58 ffffffff81091f97 >>>> Sep 20 16:01:57 node001 kernel: ffff88100ddcbab8 ffff880fbe7cffd8 000000000000fb88 ffff88100ddcbab8 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? 
gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15828 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000011 0 15828 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff88100684bac8 0000000000000086 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 000000000684ba58 00000040ffffffff >>>> Sep 20 16:01:57 node001 kernel: ffff88100edc7af8 ffff88100684bfd8 000000000000fb88 ffff88100edc7af8 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> Sep 20 16:01:57 node001 kernel: INFO: task iozone:15830 blocked for more than 120 seconds. >>>> Sep 20 16:01:57 node001 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>> Sep 20 16:01:57 node001 kernel: iozone D 0000000000000000 0 15830 15374 0x00000080 >>>> Sep 20 16:01:57 node001 kernel: ffff880fbdd0fac8 0000000000000082 0000000000000000 ffffffff81276b66 >>>> Sep 20 16:01:57 node001 kernel: 0000000000000002 ffff881ff95588d8 00000000bdd0fa58 00000040ffffffff >>>> Sep 20 16:01:57 node001 kernel: ffff88100de93098 ffff880fbdd0ffd8 000000000000fb88 ffff88100de93098 >>>> Sep 20 16:01:57 node001 kernel: Call Trace: >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81276b66>] ? __prop_inc_single+0x46/0x60 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa094357e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fe97f>] __wait_on_bit+0x5f/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa0943570>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff814fea28>] out_of_line_wait_on_bit+0x78/0x90 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81092110>] ? wake_bit_function+0x0/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff810538b6>] ? enqueue_task+0x66/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09454f5>] gfs2_glock_wait+0x45/0x90 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09468f7>] gfs2_glock_nq+0x237/0x3d0 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540ec>] gfs2_permission+0xec/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffffa09540e4>] ? 
gfs2_permission+0xe4/0x100 [gfs2] >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118982d>] __link_path_walk+0xad/0x1030 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118aa3a>] path_walk+0x6a/0xe0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118ac0b>] do_path_lookup+0x5b/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118b877>] user_path_at+0x57/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff811804ac>] vfs_fstatat+0x3c/0x80 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8118061b>] vfs_stat+0x1b/0x20 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff81180644>] sys_newstat+0x24/0x50 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0 >>>> Sep 20 16:01:57 node001 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b >>>> >>>> >>> >> >> >> > > -- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster