Hi,

I bought two new hard drives today to expand my RAID array, and unfortunately one of them appears to be bad. The problem didn't show up until after I attempted to grow the array from 6 to 8 drives. I added both drives with mdadm --add /dev/md1 /dev/sdb1, which completed, then mdadm --add /dev/md1 /dev/sdc1, which also completed. I then ran mdadm --grow /dev/md1 --raid-devices=8. It passed the critical section and began the grow process.

After a few minutes I started to hear unusual sounds from within the case. Fearing the worst, I tried cat /proc/mdstat, which produced no output, so I checked dmesg, which showed that /dev/sdb1 was not working correctly. After several minutes dmesg indicated that mdadm had given up and the grow process had stopped.

After googling around I tried the solutions that seemed most likely to work, including removing the new drives with mdadm --remove --force /dev/md1 /dev/sd[bc]1 and rebooting, after which I ran mdadm -Af /dev/md1. The grow process restarted, then failed almost immediately. Trying to mount the array gives me a reiserfs replay failure and a suggestion to run fsck. I don't dare fsck the array since I've already messed it up so badly.

Is there any way to go back to the original working 6-disk configuration with minimal data loss? Here's where I'm at right now (the full command sequence I ran is recapped at the end of this mail); please let me know if I need to include any additional information.

# uname -a
Linux nas 2.6.22-gentoo-r5 #1 SMP Thu Aug 23 16:59:47 MDT 2007 x86_64 AMD Athlon(tm) 64 Processor 3500+ AuthenticAMD GNU/Linux

# mdadm --version
mdadm - v2.6.2 - 21st May 2007

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 hdb1[0] sdb1[8](F) sda1[5] sdf1[4] sde1[3] sdg1[2] sdd1[1]
      1220979520 blocks super 0.91 level 5, 64k chunk, algorithm 2 [8/6] [UUUUUU__]

unused devices: <none>

# mdadm --detail --verbose /dev/md1
/dev/md1:
        Version : 00.91.03
  Creation Time : Sun Apr 8 19:48:01 2007
     Raid Level : raid5
     Array Size : 1220979520 (1164.42 GiB 1250.28 GB)
  Used Dev Size : 244195904 (232.88 GiB 250.06 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Oct 29 00:53:21 2007
          State : clean, degraded
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

  Delta Devices : 2, (6->8)

           UUID : 56e7724e:9a5d0949:ff52889f:ac229049
         Events : 0.487460

    Number   Major   Minor   RaidDevice State
       0       3       65        0      active sync   /dev/hdb1
       1       8       49        1      active sync   /dev/sdd1
       2       8       97        2      active sync   /dev/sdg1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        4      active sync   /dev/sdf1
       5       8        1        5      active sync   /dev/sda1
       6       0        0        6      removed
       8       8       17        7      faulty spare rebuilding   /dev/sdb1

# dmesg
<snip>
md: md1 stopped.
md: unbind<hdb1>
md: export_rdev(hdb1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sda1>
md: export_rdev(sda1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sde1>
md: export_rdev(sde1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: bind<sdd1>
md: bind<sdg1>
md: bind<sde1>
md: bind<sdf1>
md: bind<sda1>
md: bind<sdb1>
md: bind<sdc1>
md: bind<hdb1>
md: md1 stopped.
md: unbind<hdb1>
md: export_rdev(hdb1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sda1>
md: export_rdev(sda1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sde1>
md: export_rdev(sde1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: bind<sdd1>
md: bind<sdg1>
md: bind<sde1>
md: bind<sdf1>
md: bind<sda1>
md: bind<sdb1>
md: bind<sdc1>
md: bind<hdb1>
md: kicking non-fresh sdc1 from array!
md: unbind<sdc1>
md: export_rdev(sdc1)
raid5: reshape will continue
raid5: device hdb1 operational as raid disk 0
raid5: device sdb1 operational as raid disk 7
raid5: device sda1 operational as raid disk 5
raid5: device sdf1 operational as raid disk 4
raid5: device sde1 operational as raid disk 3
raid5: device sdg1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: allocated 8462kB for md1
raid5: raid level 5 set md1 active with 7 out of 8 devices, algorithm 2
RAID5 conf printout:
 --- rd:8 wd:7
 disk 0, o:1, dev:hdb1
 disk 1, o:1, dev:sdd1
 disk 2, o:1, dev:sdg1
 disk 3, o:1, dev:sde1
 disk 4, o:1, dev:sdf1
 disk 5, o:1, dev:sda1
 disk 7, o:1, dev:sdb1
...ok start reshape thread
md: reshape of RAID array md1
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
md: using 128k window, over a total of 244195904 blocks.
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:3f:42:02/00:04:00:00:00/e0 tag 0 cdb 0x0 data 524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: port is slow to respond, please be patient (Status 0xd8)
ata2: device not ready (errno=-16), forcing hardreset
ata2: hard resetting port
<repeats 4 more times>
ata2: reset failed, giving up
ata2.00: disabled
ata2: EH complete
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 148031
raid5: Disk failure on sdb1, disabling device. Operation continuing on 6 devices
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 149055
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sdb, sector 149439
md: md1: reshape done.
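To summarize, the full sequence of commands that got me here (the same ones described above, in the order I ran them) was:

# mdadm --add /dev/md1 /dev/sdb1
# mdadm --add /dev/md1 /dev/sdc1
# mdadm --grow /dev/md1 --raid-devices=8
  (reshape started, then sdb failed and the reshape stopped)
# mdadm --remove --force /dev/md1 /dev/sd[bc]1
  (rebooted)
# mdadm -Af /dev/md1
  (reshape restarted, then failed again almost immediately)

I have not run fsck on the array since the reshape failed.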