Dear Neil,

The same issue is also reproducible in the latest upstream kernel. I tested the latest stable upstream kernel, 3.17.6, and found the same issue.

[root@root ~]# modinfo raid456
filename:       /lib/modules/3.17.6/kernel/drivers/md/raid456.ko
alias:          raid6
alias:          raid5
alias:          md-level-6
alias:          md-raid6
alias:          md-personality-8
alias:          md-level-4
alias:          md-level-5
alias:          md-raid4
alias:          md-raid5
alias:          md-personality-4
description:    RAID4/5/6 (striping with parity) personality for MD
license:        GPL
srcversion:     0EEF680023FDC7410F7989A
depends:        async_raid6_recov,async_pq,async_tx,async_memcpy,async_xor
intree:         Y
vermagic:       3.17.6 SMP mod_unload modversions
parm:           devices_handle_discard_safely:Set to Y if all devices in each array reliably return zeroes on reads from discarded regions (bool)

Thanks,
Manibalan.

-----Original Message-----
From: Manibalan P
Sent: Wednesday, December 17, 2014 12:01 PM
To: 'linux-raid'
Cc: 'NeilBrown'
Subject: RE: md_raid5 using 100% CPU and hang with status resync=PENDING, if a drive is removed during initialization

Dear Neil,

We are facing an IO hang with raid5 in the following scenario. (Please see the attachment for the complete information.)

In a RAID5 array, if a drive is removed during initialization while IO is in progress on that md, the IO gets stuck and the md_raid5 thread uses 100% of the CPU. The md state also shows resync=PENDING.

Kernel: the issue was found in the following kernels:
RHEL 6.5 (2.6.32-431.el6.x86_64)
CentOS 7 (kernel-3.10.0-123.13.1.el7.x86_64)

Steps to reproduce the issue:

1. Create a RAID5 md with 4 drives using the mdadm command below.
   mdadm -C /dev/md0 -c 64 -l 5 -f -n 4 -e 1.2 /dev/sdb6 /dev/sdc6 /dev/sdd6 /dev/sde6
2. Make the md writable.
   mdadm --readwrite /dev/md0
3. The md will now start initialization.
4. Run the fio tool with the configuration below.
   /usr/bin/fio --name=md0 --filename=/dev/md0 --thread --numjobs=10 --direct=1 --group_reporting --unlink=0 --loops=1 --offset=0 --randrepeat=1 --norandommap --scramble_buffers=1 --stonewall --ioengine=libaio --rw=randwrite --bs=8704 --iodepth=4000 --runtime=3000 --blockalign=512
5. While the md is initializing, remove a drive (either with mdadm set faulty/remove, or by removing it manually).
6. The IO will now get stuck, and cat /proc/mdstat shows the state resync=PENDING.

---------------------------------------------------------------------------------------------
top output shows md_raid5 using 100% CPU:

top - 17:55:06 up 1:09, 3 users, load average: 11.98, 8.53, 3.99
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 2690 root      20   0     0    0    0 R 100.0  0.0  6:44.41 md0_raid5
---------------------------------------------------------------------------------------------
dmesg shows the stack trace:

INFO: task fio:2715 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 000000000000000a 0 2715 2654 0x00000080
 ffff88043b623598 0000000000000082 0000000000000000 ffffffff81058d53
 ffff88043b623548 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b40b098 ffff88043b623fd8 000000000000fbc8 ffff88043b40b098
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8140fa39>] ? md_wakeup_thread+0x39/0x70
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffffa0308f66>] ? make_request+0x306/0xc6c [raid456]
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81122283>] ? mempool_alloc+0x63/0x140
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c767a>] do_direct_IO+0x7ca/0xfa0
 [<ffffffff811c8196>] __blockdev_direct_IO_newtrunc+0x346/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2717 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000004 0 2717 2654 0x00000080
 ffff880439e97698 0000000000000082 ffff880439e97628 ffffffff81058d53
 ffff880439e97648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0adab8 ffff880439e97fd8 000000000000fbc8 ffff88043b0adab8
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8e50>] __blockdev_direct_IO_newtrunc+0x1000/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2718 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000005 0 2718 2654 0x00000080
 ffff88043bc13698 0000000000000082 ffff88043bc13628 ffffffff81058d53
 ffff88043bc13648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0ad058 ffff88043bc13fd8 000000000000fbc8 ffff88043b0ad058
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8e50>] __blockdev_direct_IO_newtrunc+0x1000/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2719 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000001 0 2719 2654 0x00000080
 ffff880439ebb698 0000000000000082 ffff880439ebb628 ffffffff81058d53
 ffff880439ebb648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0ac5f8 ffff880439ebbfd8 000000000000fbc8 ffff88043b0ac5f8
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2720 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000008 0 2720 2654 0x00000080
 ffff88043b8cf698 0000000000000082 ffff88043b8cf628 ffffffff81058d53
 ffff88043b8cf648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e89af8 ffff88043b8cffd8 000000000000fbc8 ffff880439e89af8
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2721 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000000 0 2721 2654 0x00000080
 ffff88043b047698 0000000000000082 ffff88043b047628 ffffffff81058d53
 ffff88043b047648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e89098 ffff88043b047fd8 000000000000fbc8 ffff880439e89098
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2722 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000000 0 2722 2654 0x00000080
 ffff880439ea3698 0000000000000082 ffff880439ea3628 ffffffff81058d53
 ffff880439ea3648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e88638 ffff880439ea3fd8 000000000000fbc8 ffff880439e88638
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2723 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000006 0 2723 2654 0x00000080
 ffff88043bf5f698 0000000000000082 ffff88043bf5f628 ffffffff81058d53
 ffff88043bf5f648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a183ab8 ffff88043bf5ffd8 000000000000fbc8 ffff88043a183ab8
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2724 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 000000000000000b 0 2724 2654 0x00000080
 ffff88043be05698 0000000000000082 ffff88043be05628 ffffffff81058d53
 ffff88043be05648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a183058 ffff88043be05fd8 000000000000fbc8 ffff88043a183058
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task fio:2725 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000003 0 2725 2654 0x00000080
 ffff88043be07698 0000000000000082 ffff88043be07628 ffffffff81058d53
 ffff88043be07648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a1825f8 ffff88043be07fd8 000000000000fbc8 ffff88043a1825f8
Call Trace:
 [<ffffffff81058d53>] ? __wake_up+0x53/0x70
 [<ffffffffa030334b>] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [<ffffffffa0304146>] get_active_stripe+0x236/0x830 [raid456]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffff8109b5ce>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0308e15>] make_request+0x1b5/0xc6c [raid456]
 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811220e5>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81415b41>] md_make_request+0xe1/0x230
 [<ffffffff811c3fd2>] ? bvec_alloc_bs+0x62/0x110
 [<ffffffff811c32f0>] ? __bio_add_page+0x110/0x230
 [<ffffffff81266c50>] generic_make_request+0x240/0x5a0
 [<ffffffff811c742c>] ? do_direct_IO+0x57c/0xfa0
 [<ffffffff81267020>] submit_bio+0x70/0x120
 [<ffffffff811c8acd>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c9137>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811c53b7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c4330>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81120552>] generic_file_direct_write+0xc2/0x190
 [<ffffffff81121e71>] __generic_file_aio_write+0x3a1/0x490
 [<ffffffff811d64c0>] ? aio_read_evt+0xa0/0x170
 [<ffffffff811c490c>] blkdev_aio_write+0x3c/0xa0
 [<ffffffff811c48d0>] ? blkdev_aio_write+0x0/0xa0
 [<ffffffff811d4f64>] aio_rw_vect_retry+0x84/0x200
 [<ffffffff811d6924>] aio_run_iocb+0x64/0x170
 [<ffffffff811d7d51>] do_io_submit+0x291/0x920
 [<ffffffff811d83f0>] sys_io_submit+0x10/0x20
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

[root@root ~]# cat /proc/2690/stack
[<ffffffff810686da>] __cond_resched+0x2a/0x40
[<ffffffffa030361c>] ops_run_io+0x2c/0x920 [raid456]
[<ffffffffa03052cc>] handle_stripe+0x9cc/0x2980 [raid456]
[<ffffffffa03078a4>] raid5d+0x624/0x850 [raid456]
[<ffffffff81416f05>] md_thread+0x115/0x150
[<ffffffff8109aef6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffffffffffff>] 0xffffffffffffffff

[root@root ~]# cat /proc/2690/stat
2690 (md0_raid5) R 2 0 0 0 -1 2149613632 0 0 0 0 0 68495 0 0 20 0 1 0 350990 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483391 256 0 0 0 17 2 0 0 6855 0 0

[root@root ~]# cat /proc/2690/statm
0 0 0 0 0 0 0

[root@root ~]# cat /proc/2690/stat
stat statm status

[root@root ~]# cat /proc/2690/status
Name: md0_raid5
State: R (running)
Tgid: 2690
Pid: 2690
PPid: 2
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
Utrace: 0
FDSize: 64
Groups:
Threads: 1
SigQ: 2/128402
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: fffffffffffffeff
SigCgt: 0000000000000100
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: fffffffffffffeff
CapBnd: ffffffffffffffff
Cpus_allowed: ffffff
Cpus_allowed_list: 0-23
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 5411612
nonvoluntary_ctxt_switches: 257032

Thanks,
Manibalan.
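
For anyone who wants to try this on a test machine, the reproduction steps in the report can be sketched as a small script. This is only a sketch under assumptions: the `run` helper, the `DRY_RUN` switch, the `VICTIM` variable and the 30-second delay before failing the drive are illustrative and not part of the original report. The mdadm and fio invocations are the ones quoted in the report, plus mdadm's manage-mode `--fail`/`--remove` for the drive removal. With `DRY_RUN=1` (the default here) the commands are only printed. With `DRY_RUN=0` the script is destructive to the listed partitions.

```shell
#!/bin/sh
# Sketch of the reproduction steps. DESTRUCTIVE when DRY_RUN=0.
# MD, MEMBERS, VICTIM, DRY_RUN and the 30 s delay are illustrative assumptions.

MD=${MD:-/dev/md0}
MEMBERS=${MEMBERS:-"/dev/sdb6 /dev/sdc6 /dev/sdd6 /dev/sde6"}
VICTIM=${VICTIM:-/dev/sde6}   # member to fail while the array initializes
DRY_RUN=${DRY_RUN:-1}         # 1 = print the commands instead of running them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Create the 4-drive RAID5 array and make it writable, which starts
    # the initial resync. $MEMBERS is left unquoted on purpose so that it
    # splits into one argument per device.
    run mdadm -C "$MD" -c 64 -l 5 -f -n 4 -e 1.2 $MEMBERS
    run mdadm --readwrite "$MD"

    # Start the random direct-write load from the report in the background.
    run /usr/bin/fio --name=md0 --filename="$MD" --thread --numjobs=10 \
        --direct=1 --group_reporting --unlink=0 --loops=1 --offset=0 \
        --randrepeat=1 --norandommap --scramble_buffers=1 --stonewall \
        --ioengine=libaio --rw=randwrite --bs=8704 --iodepth=4000 \
        --runtime=3000 --blockalign=512 &

    # Fail and remove one member while the resync is still running.
    run sleep 30
    run mdadm "$MD" --fail "$VICTIM" --remove "$VICTIM"

    # Check the array state; the report sees resync=PENDING here.
    run cat /proc/mdstat
}

reproduce
```

Running it as-is prints the command sequence for review. Setting `DRY_RUN=0` executes it, after which `top` and `cat /proc/<pid>/stack` for the md0_raid5 thread can be compared with the output above.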