Hello, I posted about this problem several months ago and unfortunately never received any suggestions. I haven't been able to fix it on my own, so I am hoping someone here can help.

I have a RAID10 array that originally consisted of 6x EXOS 16TB drives connected through an LSI 9201-16e (SAS2116). Needing more space, I added 4 more 16TB drives, using the following commands to add them and grow the array:

[2024-06-24 19:38:12] sudo mdadm /dev/md2 --add /dev/sd[i-l]1
[2024-06-24 19:39:27] sudo mdadm --grow /dev/md2 --raid-devices=10

After 10-11 hours of reshaping (I knew it would take a long time), the reshape appeared to freeze at 22.1% complete. In dmesg I saw the following:

[260007.679410] md: md2: reshape interrupted.
[260144.852441] INFO: task md2_reshape:242508 blocked for more than 122 seconds.
[260144.852459]       Tainted: G           OE      6.9.3-060903-generic #202405300957
[260144.852466] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[260144.852471] task:md2_reshape     state:D stack:0     pid:242508 tgid:242508 ppid:2      flags:0x00004000
[260144.852484] Call Trace:
[260144.852489]  <TASK>
[260144.852496]  __schedule+0x279/0x6a0
[260144.852512]  schedule+0x29/0xd0
[260144.852523]  wait_barrier.part.0+0x180/0x1e0 [raid10]
[260144.852544]  ? __pfx_autoremove_wake_function+0x10/0x10
[260144.852560]  wait_barrier+0x70/0xc0 [raid10]
[260144.852577]  raid10_sync_request+0x177e/0x19e3 [raid10]
[260144.852595]  ? __schedule+0x281/0x6a0
[260144.852605]  md_do_sync+0xa36/0x1390
[260144.852615]  ? __pfx_autoremove_wake_function+0x10/0x10
[260144.852628]  ? __pfx_md_thread+0x10/0x10
[260144.852635]  md_thread+0xa5/0x1a0
[260144.852643]  ? __pfx_md_thread+0x10/0x10
[260144.852649]  kthread+0xe4/0x110
[260144.852659]  ? __pfx_kthread+0x10/0x10
[260144.852667]  ret_from_fork+0x47/0x70
[260144.852675]  ? __pfx_kthread+0x10/0x10
[260144.852683]  ret_from_fork_asm+0x1a/0x30
[260144.852693]  </TASK>
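While it is stuck, I've also been watching the reshape state through sysfs. If I understand the md sysfs interface correctly, these are the relevant files (I can post their contents if that would help):

cat /sys/block/md2/md/sync_action        # should report "reshape"
cat /sys/block/md2/md/sync_completed     # sectors completed / total sectors
cat /sys/block/md2/md/reshape_position   # sector where the reshape is parked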
Some other info which may be helpful:

bill@bill-desk:~$ mdadm --version
mdadm - v4.3 - 2024-02-15 - Ubuntu 4.3-1ubuntu2

bill@bill-desk:~$ uname -a
Linux bill-desk 6.9.3-060903-generic #202405300957 SMP PREEMPT_DYNAMIC Thu May 30 11:39:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

bill@bill-desk:~$ sudo mdadm -D /dev/md2
/dev/md2:
           Version : 1.2
     Creation Time : Sat Nov 20 14:29:13 2021
        Raid Level : raid10
        Array Size : 46877236224 (43.66 TiB 48.00 TB)
     Used Dev Size : 15625745408 (14.55 TiB 16.00 TB)
      Raid Devices : 10
     Total Devices : 10
       Persistence : Superblock is persistent
     Intent Bitmap : Internal
       Update Time : Tue Jun 25 10:05:18 2024
             State : clean, reshaping
    Active Devices : 10
   Working Devices : 10
    Failed Devices : 0
     Spare Devices : 0
            Layout : near=2
        Chunk Size : 512K
Consistency Policy : bitmap
    Reshape Status : 22% complete
     Delta Devices : 4, (6->10)
              Name : bill-desk:2  (local to host bill-desk)
              UUID : 8a321996:5beb9c15:4c3fcf5b:6c8b6005
            Events : 77923

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync set-A   /dev/sde1
       1       8       81        1      active sync set-B   /dev/sdf1
       2       8       97        2      active sync set-A   /dev/sdg1
       3       8      113        3      active sync set-B   /dev/sdh1
       5       8      209        4      active sync set-A   /dev/sdn1
       4       8      193        5      active sync set-B   /dev/sdm1
       9       8      177        6      active sync set-A   /dev/sdl1
       8       8      161        7      active sync set-B   /dev/sdk1
       7       8      145        8      active sync set-A   /dev/sdj1
       6       8      129        9      active sync set-B   /dev/sdi1

bill@bill-desk:~$ cat /proc/mdstat
Personalities : [raid10] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      15627786240 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 0/117 pages [0KB], 65536KB chunk

md2 : active raid10 sdl1[9] sdk1[8] sdj1[7] sdi1[6] sdn1[5] sdh1[3] sdf1[1] sde1[0] sdg1[2] sdm1[4]
      46877236224 blocks super 1.2 512K chunks 2 near-copies [10/10] [UUUUUUUUUU]
      [====>................]  reshape = 22.1% (10380906624/46877236224) finish=2322382.1min speed=261K/sec
      bitmap: 59/146 pages [236KB], 262144KB chunk

unused devices: <none>

In the meantime I have rebooted several times, done some system software updates, etc. Nothing has improved or fixed it. I just have no idea how to help this along. It won't finish the reshape, and I can't mount the array to copy the data off. I have enough spare disk space to copy everything to a temporary home if I could access it, but the array won't mount; or rather, I don't know whether it is even safe to attempt to mount it.

One difference I've noticed recently is that after rebooting, the array comes up as md127 instead of md2 as before.
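If I understand correctly, the md127 name usually just means the array isn't listed in the initramfs copy of /etc/mdadm/mdadm.conf. Once things are healthy again, I plan to pin the name with something like:

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u

i.e. an ARRAY line such as:

ARRAY /dev/md2 metadata=1.2 name=bill-desk:2 UUID=8a321996:5beb9c15:4c3fcf5b:6c8b6005

I assume the rename is only cosmetic, but please correct me if it could matter for the stuck reshape.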
BTW, md1 is an unrelated array on the same system, and it's working fine.

Upon reboot the array assembles and the reshape begins again, but four seconds later it is interrupted:

[    2.451982] md/raid10:md1: active with 4 out of 4 devices
[    2.471053] md1: detected capacity change from 0 to 31255572480
[    2.517951] md/raid10:md127: not clean -- starting background reconstruction
[    2.517956] md/raid10:md127: active with 10 out of 10 devices
[    2.541203] md127: detected capacity change from 0 to 93754472448
[    2.928687] raid6: sse2x4   gen()  7064 MB/s
[    2.945680] raid6: sse2x2   gen()  8468 MB/s
[    2.962690] raid6: sse2x1   gen()  6369 MB/s
[    2.962691] raid6: using algorithm sse2x2 gen() 8468 MB/s
[    2.979680] raid6: .... xor() 7029 MB/s, rmw enabled
[    2.979682] raid6: using ssse3x2 recovery algorithm
[    5.946619] EXT4-fs (md1): mounted filesystem 8f645711-4d2b-42bf-877c-a8c993923a7c r/w with ordered data mode. Quota mode: none.
[  401.676363] md: reshape of RAID array md127
[  405.049914] md: md127: reshape interrupted.
[  615.617649] INFO: task md127_reshape:5304 blocked for more than 122 seconds.
[  615.617684] task:md127_reshape   state:D stack:0     pid:5304  tgid:5304  ppid:2      flags:0x00004000
[  615.617747]  wait_barrier.part.0+0x188/0x1e0 [raid10]
[  615.617781]  wait_barrier+0x70/0xc0 [raid10]
[  615.617798]  raid10_sync_request+0x1545/0x183d [raid10]
[  615.617831]  md_do_sync.cold+0x609/0xa1f
[  615.617862]  md_thread+0xa3/0x1a0
[  615.617875]  ? __pfx_md_thread+0x10/0x10

The portion of dmesg that starts with INFO: repeats every two minutes or so.

I really don't know what to do here. I've been struggling for a few months to get this back to a working state, with no success. If the reshape isn't ultimately going to succeed, can anyone explain how to at least mount the array safely so I can copy the data off? I can provide further info as needed.

Thank you,
Bill Morgan
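P.S. In case it helps frame an answer: what I've been contemplating, but have not dared to try, is to assemble the array read-only so nothing gets written, and then copy the data off from a read-only mount. Roughly (the member list comes from my mdadm -D output above; /mnt/recovery is just a placeholder):

sudo mdadm --stop /dev/md127
sudo mdadm --assemble --readonly /dev/md2 /dev/sd[e-n]1   # the ten member partitions
sudo mount -o ro /dev/md2 /mnt/recovery                   # an empty directory I'd create

My understanding is that --assemble --readonly should leave the reshape parked where it is, but I'd appreciate confirmation that this is safe with a reshape only 22% complete before I touch anything.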