>>>>> "Peter" == Peter Bates <peter.thebates@xxxxxxxxx> writes: Peter> I have a 3 disk RAID 5 array that I tried to add a 4th disk to. >> mdadm --add /dev/md6 /dev/sdb1 >> mdadm --grow --raid-devices=4 /dev/md6 Peter> This operation started successfully and proceeded until it hit 51.1% >> cat /proc/mdstat Peter> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] Peter> [raid4] [multipath] [faulty] Peter> md6 : active raid5 sda1[0] sdb1[5] sdf1[3] sde1[4] Peter> 3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU] Peter> [==========>..........] reshape = 51.1% (998533632/1953382400) Peter> finish=9046506.1min speed=1K/sec Peter> bitmap: 0/15 pages [0KB], 65536KB chunk Peter> It has been sitting on the same 998533632 position for Peter> days. I've tried a few reboots, but it never progresses. Peter> Stopping the array, or trying to start the logical volume in it Peter> hangs. Altering the min / max speed parameters has no effect. Peter> When I reboot and resemble the array the speed indicated Peter> steadily drops to almost 0. >> mdadm --assemble /dev/md6 --verbose --uuid 90c2b5c3:3bbfa0d7:a5efaeed:726c43e2 I looked back in my email archives, and I wonder if maybe you have SElinux enabled? If so, please turn it off and see if that helps. What happens when you use dd on each of the drives and dump the output to /dev/null? Are there any messages in the logs, or dmesg output after the stuff you showed? Can you maybe 'strace' the mdadm process, or even go grab the latest version using git from: git clone git://neil.brown.name/mdadm And see if compiling it yourself from the master might do the trick. Peter> I haven't tried anything more drastic than a reboot yet, Peter> Below is as much information as I can think to provide at this stage. Peter> Please let me know what else I can do. Peter> I'm happy to change kernels, kernel config or anything else require to Peter> get better info. Peter> Kernel: 4.4.3 Peter> mdadm 3.4 >> ps aux | grep md6 Peter> root 5041 99.9 0.0 0 0 ? R 07:10 761:58 [md6_raid5] Peter> root 5042 0.0 0.0 0 0 ? D 07:10 0:00 [md6_reshape] Peter> This is consistent. 100% cpu on the raid component, but not the reshape >> mdadm --detail --verbose /dev/md6 Peter> /dev/md6: Peter> Version : 1.2 Peter> Creation Time : Fri Aug 29 21:13:52 2014 Peter> Raid Level : raid5 Peter> Array Size : 3906764800 (3725.78 GiB 4000.53 GB) Peter> Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB) Peter> Raid Devices : 4 Peter> Total Devices : 4 Peter> Persistence : Superblock is persistent Peter> Intent Bitmap : Internal Peter> Update Time : Wed Apr 27 07:10:07 2016 Peter> State : clean, reshaping Peter> Active Devices : 4 Peter> Working Devices : 4 Peter> Failed Devices : 0 Peter> Spare Devices : 0 Peter> Layout : left-symmetric Peter> Chunk Size : 512K Peter> Reshape Status : 51% complete Peter> Delta Devices : 1, (3->4) Peter> Name : Alpheus:6 (local to host Alpheus) Peter> UUID : 90c2b5c3:3bbfa0d7:a5efaeed:726c43e2 Peter> Events : 47975 Peter> Number Major Minor RaidDevice State Peter> 0 8 1 0 active sync /dev/sda1 Peter> 4 8 65 1 active sync /dev/sde1 Peter> 3 8 81 2 active sync /dev/sdf1 Peter> 5 8 17 3 active sync /dev/sdb1 >> iostat Peter> Linux 4.4.3-gentoo (Alpheus) 04/27/2016 _x86_64_ (4 CPU) Peter> avg-cpu: %user %nice %system %iowait %steal %idle Peter> 1.84 0.00 24.50 0.09 0.00 73.57 Peter> Looking at the individual disks I can see minor activity on the MD6 Peter> members. 
Peter> I haven't tried anything more drastic than a reboot yet. Below is
Peter> as much information as I can think to provide at this stage.
Peter> Please let me know what else I can do. I'm happy to change
Peter> kernels, kernel config, or anything else required to get better
Peter> info.

Peter> Kernel: 4.4.3
Peter> mdadm: 3.4

>> ps aux | grep md6

Peter> root  5041  99.9  0.0  0  0  ?  R  07:10  761:58  [md6_raid5]
Peter> root  5042   0.0  0.0  0  0  ?  D  07:10    0:00  [md6_reshape]

Peter> This is consistent: 100% CPU on the raid5 thread, but none on the
Peter> reshape thread.

>> mdadm --detail --verbose /dev/md6

Peter> /dev/md6:
Peter>         Version : 1.2
Peter>   Creation Time : Fri Aug 29 21:13:52 2014
Peter>      Raid Level : raid5
Peter>      Array Size : 3906764800 (3725.78 GiB 4000.53 GB)
Peter>   Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Peter>    Raid Devices : 4
Peter>   Total Devices : 4
Peter>     Persistence : Superblock is persistent
Peter>   Intent Bitmap : Internal
Peter>     Update Time : Wed Apr 27 07:10:07 2016
Peter>           State : clean, reshaping
Peter>  Active Devices : 4
Peter> Working Devices : 4
Peter>  Failed Devices : 0
Peter>   Spare Devices : 0
Peter>          Layout : left-symmetric
Peter>      Chunk Size : 512K
Peter>  Reshape Status : 51% complete
Peter>   Delta Devices : 1, (3->4)
Peter>            Name : Alpheus:6  (local to host Alpheus)
Peter>            UUID : 90c2b5c3:3bbfa0d7:a5efaeed:726c43e2
Peter>          Events : 47975

Peter>     Number   Major   Minor   RaidDevice  State
Peter>        0       8        1        0       active sync   /dev/sda1
Peter>        4       8       65        1       active sync   /dev/sde1
Peter>        3       8       81        2       active sync   /dev/sdf1
Peter>        5       8       17        3       active sync   /dev/sdb1

>> iostat

Peter> Linux 4.4.3-gentoo (Alpheus)   04/27/2016   _x86_64_   (4 CPU)

Peter> avg-cpu:  %user   %nice  %system  %iowait  %steal   %idle
Peter>            1.84    0.00    24.50     0.09    0.00   73.57

Peter> Looking at the individual disks, I can see minor activity on the
Peter> md6 members. This activity tends to match up with the overall rate
Peter> reported by /proc/mdstat.

Peter> Device:   tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
Peter> sda      0.02        2.72        1.69    128570     79957
Peter> sdb      0.01        0.03        1.69      1447     79889
Peter> sdd      3.85        2.27       56.08    106928   2646042
Peter> sde      0.02        2.73        1.69    128610     79961
Peter> sdf      0.02        2.72        1.69    128128     79961
Peter> sdc      4.08        5.44       56.08    256899   2646042
Peter> md0      2.91        7.62       55.08    359714   2598725
Peter> dm-0     0.00        0.03        0.00      1212         0
Peter> dm-1     0.00        0.05        0.00      2151         9
Peter> dm-2     2.65        6.52        3.42    307646    161296
Peter> dm-3     0.19        1.03       51.66     48377   2437420
Peter> md6      0.00        0.02        0.00      1036         0

>> dmesg

Peter> [ 1199.426995] md: bind<sde1>
Peter> [ 1199.427779] md: bind<sdf1>
Peter> [ 1199.428379] md: bind<sdb1>
Peter> [ 1199.428592] md: bind<sda1>
Peter> [ 1199.429260] md/raid:md6: reshape will continue
Peter> [ 1199.429274] md/raid:md6: device sda1 operational as raid disk 0
Peter> [ 1199.429275] md/raid:md6: device sdb1 operational as raid disk 3
Peter> [ 1199.429276] md/raid:md6: device sdf1 operational as raid disk 2
Peter> [ 1199.429277] md/raid:md6: device sde1 operational as raid disk 1
Peter> [ 1199.429498] md/raid:md6: allocated 4338kB
Peter> [ 1199.429807] md/raid:md6: raid level 5 active with 4 out of 4 devices, algorithm 2
Peter> [ 1199.429810] RAID conf printout:
Peter> [ 1199.429811]  --- level:5 rd:4 wd:4
Peter> [ 1199.429812]  disk 0, o:1, dev:sda1
Peter> [ 1199.429814]  disk 1, o:1, dev:sde1
Peter> [ 1199.429816]  disk 2, o:1, dev:sdf1
Peter> [ 1199.429817]  disk 3, o:1, dev:sdb1
Peter> [ 1199.429993] created bitmap (15 pages) for device md6
Peter> [ 1199.430297] md6: bitmap initialized from disk: read 1 pages, set 0 of 29807 bits
Peter> [ 1199.474604] md6: detected capacity change from 0 to 4000527155200
Peter> [ 1199.474611] md: reshape of RAID array md6
Peter> [ 1199.474613] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Peter> [ 1199.474614] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
Peter> [ 1199.474617] md: using 128k window, over a total of 1953382400k.
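(On the min/max speed parameters: just to be sure we're twiddling the
same knobs, I mean something along the lines of:

    echo 50000  > /proc/sys/dev/raid/speed_limit_min
    echo 200000 > /proc/sys/dev/raid/speed_limit_max

You can also watch the kernel's own view of the reshape through sysfs,
e.g.:

    cat /sys/block/md6/md/sync_action
    cat /sys/block/md6/md/sync_completed
    cat /sys/block/md6/md/reshape_position

If sync_completed never changes between reads, the reshape thread
really is stuck rather than just slow.)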
>> lsblk

Peter> NAME                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
Peter> sda                               8:0    0  1.8T  0 disk
Peter> └─sda1                            8:1    0  1.8T  0 part
Peter>   └─md6                           9:6    0  3.7T  0 raid5
Peter> sdb                               8:16   0  1.8T  0 disk
Peter> └─sdb1                            8:17   0  1.8T  0 part
Peter>   └─md6                           9:6    0  3.7T  0 raid5
Peter> sdc                               8:32   0  2.7T  0 disk
Peter> ├─sdc1                            8:33   0   16M  0 part
Peter> └─sdc2                            8:34   0  2.7T  0 part
Peter>   └─md0                           9:0    0  2.7T  0 raid1
Peter>     ├─vg--mirror-swap           253:0    0    4G  0 lvm   [SWAP]
Peter>     ├─vg--mirror-boot           253:1    0  256M  0 lvm   /boot
Peter>     ├─vg--mirror-root           253:2    0  256G  0 lvm   /
Peter>     └─vg--mirror-data--mirror   253:3    0  2.5T  0 lvm   /data/mirror
Peter> sdd                               8:48   0  2.7T  0 disk
Peter> ├─sdd1                            8:49   0   16M  0 part
Peter> └─sdd2                            8:50   0  2.7T  0 part
Peter>   └─md0                           9:0    0  2.7T  0 raid1
Peter>     ├─vg--mirror-swap           253:0    0    4G  0 lvm   [SWAP]
Peter>     ├─vg--mirror-boot           253:1    0  256M  0 lvm   /boot
Peter>     ├─vg--mirror-root           253:2    0  256G  0 lvm   /
Peter>     └─vg--mirror-data--mirror   253:3    0  2.5T  0 lvm   /data/mirror
Peter> sde                               8:64   0  1.8T  0 disk
Peter> └─sde1                            8:65   0  1.8T  0 part
Peter>   └─md6                           9:6    0  3.7T  0 raid5
Peter> sdf                               8:80   0  1.8T  0 disk
Peter> └─sdf1                            8:81   0  1.8T  0 part
Peter>   └─md6                           9:6    0  3.7T  0 raid5

Peter> Thanks for any pointers.

Peter> Peter Bates
Peter> peter.thebates@xxxxxxxxx
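One more thing that would help: the on-disk superblock state of each
member, so we can compare the reshape checkpoint each device thinks it
has reached.  Something like:

    for d in /dev/sda1 /dev/sde1 /dev/sdf1 /dev/sdb1; do
        mdadm --examine $d
    done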