Hi,

I'm trying to reshape a 4-disk RAID6 array by adding a fifth "missing" drive. Maybe that's a weird thing to do, so for context: I'm converting from a 3-disk RAID10 by creating a new RAID6 with the three new disks and then moving disks one at a time between the arrays. I did it this way so that I could test for problems with the reshape procedure before irrevocably modifying more than one of the original disks. (I do also have an offsite backup of the most important data, but it's inconvenient to access and I'm hoping not to need it.)

Anyway, the reshape was going fine until about 70% completion, and then it got stuck. I've tried rebooting a few times: the array can be assembled in read-only mode, but as soon as it goes read-write and the reshape process continues, it gets through a few megabytes and hangs. At that point, any other process that tries to access the array also hangs uninterruptibly.

Here's what shows up in dmesg:

[  721.183225] INFO: task md127_resync:1730 blocked for more than 120 seconds.
[  721.183978]       Not tainted 4.0.0 #1
[  721.184751] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  721.185514] md127_resync    D ffff88042ea94440     0  1730      2 0x00000000
[  721.185516]  ffff88041a24ed20 0000000000000400 ffff88041ca82a20 0000000000000246
[  721.185518]  ffff8800b8b5ffd8 ffff8800b8b5fbf0 ffff880419035a30 0000000000000004
[  721.185519]  ffff8800b8b5fd1c ffff88040e91d000 ffffffff8155c73f ffff880419035800
[  721.185520] Call Trace:
[  721.185526]  [<ffffffff8155c73f>] ? schedule+0x2f/0x80
[  721.185530]  [<ffffffffa0888390>] ? reshape_request+0x1e0/0x8f0 [raid456]
[  721.185533]  [<ffffffff810a86f0>] ? wait_woken+0x90/0x90
[  721.185535]  [<ffffffffa0888dae>] ? sync_request+0x30e/0x390 [raid456]
[  721.185547]  [<ffffffffa02cbf89>] ? is_mddev_idle+0xc9/0x130 [md_mod]
[  721.185550]  [<ffffffffa02cf432>] ? md_do_sync+0x802/0xd30 [md_mod]
[  721.185555]  [<ffffffff8101c356>] ? native_sched_clock+0x26/0x90
[  721.185558]  [<ffffffffa02cbb30>] ? md_safemode_timeout+0x50/0x50 [md_mod]
[  721.185561]  [<ffffffffa02cbc56>] ? md_thread+0x126/0x130 [md_mod]
[  721.185563]  [<ffffffff8155c0c0>] ? __schedule+0x2a0/0x8f0
[  721.185565]  [<ffffffffa02cbb30>] ? md_safemode_timeout+0x50/0x50 [md_mod]
[  721.185568]  [<ffffffff81089403>] ? kthread+0xd3/0xf0
[  721.185570]  [<ffffffff81089330>] ? kthread_create_on_node+0x180/0x180
[  721.185572]  [<ffffffff81560598>] ? ret_from_fork+0x58/0x90
[  721.185574]  [<ffffffff81089330>] ? kthread_create_on_node+0x180/0x180

And the output of mdadm --detail/-E:
https://gist.github.com/anonymous/0b090668b56ef54bb2f0

I was originally running a Debian 3.16.0 kernel, and then upgraded to 4.0 to see if it would help, but no such luck.

Does anyone have any suggestions? Since the data on the array seems to be fine, hopefully there's a solution that doesn't involve re-creating it from scratch and restoring from backups.

Thanks,
-- David
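
P.S. In case it helps with diagnosis, the conversion steps so far were roughly the following (typed from memory, so the device names below are placeholders and the exact flags may have differed slightly):

  # Create the new RAID6 on the three new disks, with the fourth
  # slot left "missing" so the old RAID10 stays untouched:
  mdadm --create /dev/md127 --level=6 --raid-devices=4 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 missing

  # After copying the data over, take one disk out of the old
  # RAID10 and use it to fill the missing slot:
  mdadm /dev/md127 --add /dev/sde1

  # Grow to five devices, leaving the new slot "missing" for now;
  # this is the reshape that is now hanging at ~70%
  # (I believe --force was needed since there is no spare for the new slot):
  mdadm --grow /dev/md127 --raid-devices=5 --force

  # After a reboot, assembling read-only works fine:
  mdadm --assemble --readonly /dev/md127 /dev/sd[bcde]1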