Dear list,

[The following was originally posted to LKML, and I've been told this
list is a more appropriate place for this kind of report. Apologies.]

I'm trying to reshape a 3-disk RAID5 array to a 4-disk RAID6 array (of
the same total size and per-device size) using Linux kernel 4.9.237 on
x86_64. I understand that this reshaping operation is supposed to be
supported. But it appears perpetually stuck at 0% with no operation
taking place whatsoever (the slices are unchanged apart from their
metadata, the backup file contains only zeroes, and nothing happens).
I wonder if this is a known kernel bug, or what else could explain it,
and I have no idea how to debug this sort of thing.

Here are some details on exactly what I've been doing. I'll be using
loopbacks to illustrate, but I've done this on real partitions and
there was no difference.

## Create some empty loop devices:
for i in 0 1 2 3 ; do dd if=/dev/zero of=test-${i} bs=1024k count=16 ; done
for i in 0 1 2 3 ; do losetup /dev/loop${i} test-${i} ; done

## Make a RAID array out of the first three:
mdadm --create /dev/md/test --level=raid5 --chunk=256 --name=test \
  --metadata=1.0 --raid-devices=3 /dev/loop{0,1,2}

## Populate it with some content, just to see what's going on:
for i in $(seq 0 63) ; do printf "This is chunk %d (0x%x).\n" $i $i \
  | dd of=/dev/md/test bs=256k seek=$i ; done

## Now try to reshape the array from 3-way RAID5 to 4-way RAID6:
mdadm --manage /dev/md/test --add-spare /dev/loop3
mdadm --grow /dev/md/test --level=6 --raid-devices=4 \
  --backup-file=test-reshape.backup

...and then nothing happens. /proc/mdstat reports no progress
whatsoever:

md112 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      32256 blocks super 1.0 level 6, 256k chunk, algorithm 18 [4/3] [UUU_]
      [>....................]  reshape =  0.0% (1/16128) finish=1.0min speed=244K/sec

The loop file contents are unchanged except for the metadata
superblock, the backup file is entirely empty (all zeroes), and no
activity whatsoever is happening.
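For anyone wanting to repeat the "backup file contains only zeroes"
check, here is a minimal sketch of one way to do it. The helper name
is_all_zero is made up for illustration, and it assumes a cmp that
supports the -n (byte limit) option, as GNU cmp does:

```shell
# Hypothetical helper (not part of mdadm or any standard tool):
# returns 0 if the file contains only zero bytes (or is empty),
# 1 if it contains any non-zero byte, 2 if it cannot be read.
is_all_zero() {
    [ -r "$1" ] || return 2
    size=$(wc -c < "$1")
    size=$((size))   # strip any whitespace padding printed by wc
    # Compare exactly $size bytes of the file against /dev/zero.
    cmp -s -n "$size" "$1" /dev/zero
}

# Example use (assumes the backup file from the commands above exists):
#   is_all_zero test-reshape.backup && echo "backup file is all zeroes"
```

The same function can be pointed at any of the test-N loop files to
see whether a given file has been touched at all.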

Actually, further investigation shows that the array is in fact
operational as a RAID6 array, but one where the Q-syndrome is stuck in
the last device: writing data to the md device (e.g., by repopulating
it with the same command as above) does cause loop3 to be updated as
expected for such a layout. It's just the reshaping which doesn't take
place (or indeed begin).

For completeness, here's what mdadm --detail /dev/md/test looks like
before the reshape, in my example:

/dev/md/test:
        Version : 1.0
  Creation Time : Wed Sep 30 02:42:30 2020
     Raid Level : raid5
     Array Size : 32256 (31.50 MiB 33.03 MB)
  Used Dev Size : 16128 (15.75 MiB 16.52 MB)
   Raid Devices : 3
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Wed Sep 30 02:44:21 2020
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 256K

           Name : vega.stars:test  (local to host vega.stars)
           UUID : 30f40e34:b9a52ff0:75c8b063:77234832
         Events : 20

    Number   Major   Minor   RaidDevice State
       0       7        0        0      active sync   /dev/loop0
       1       7        1        1      active sync   /dev/loop1
       3       7        2        2      active sync   /dev/loop2

       4       7        3        -      spare   /dev/loop3

...and here's what it looks like after the attempted reshape has
started (or rather, refused to start):

/dev/md/test:
        Version : 1.0
  Creation Time : Wed Sep 30 02:42:30 2020
     Raid Level : raid6
     Array Size : 32256 (31.50 MiB 33.03 MB)
  Used Dev Size : 16128 (15.75 MiB 16.52 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Wed Sep 30 02:44:54 2020
          State : clean, degraded, reshaping
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 256K

 Reshape Status : 0% complete
     New Layout : left-symmetric

           Name : vega.stars:test  (local to host vega.stars)
           UUID : 30f40e34:b9a52ff0:75c8b063:77234832
         Events : 22

    Number   Major   Minor   RaidDevice State
       0       7        0        0      active sync   /dev/loop0
       1       7        1        1      active sync   /dev/loop1
       3       7        2        2      active sync   /dev/loop2
       4       7        3        3      spare rebuilding   /dev/loop3

I also tried writing "frozen" and then "resync" to the
/sys/block/md112/md/sync_action file with no further results.

I welcome any suggestions on how to investigate, work around, or fix
this problem.

Happy hacking,

-- 
     David A. Madore
   ( http://www.madore.org/~david/ )