On Sat, 31 Jan 2009 10:38:22 +0000, "David Greaves" <david@xxxxxxxxxxxx> said:
> whollygoat@xxxxxxxxxxxxxxx wrote:
> > On a boot a couple of days ago, mdadm failed a disk and
> > started resyncing to spare (raid5, 6 drives, 5 active, 1
> > spare).  smartctl -H <disk> returned info (can't remember
> > the exact text) that made me suspect the drive was
> > fine, but the data connection was bad.  Sure enough the
> > data cable was damaged.  Replaced the cable and smartctl
> > sees the disk just fine and reports no errors.
> >
> > - I'd like to re-add the drive as a spare.  Is it enough
> > to "mdadm --add /dev/hdk" or do I need to prep the drive to
> > remove any data that said where it previously belonged
> > in the array?

> That should work.
> Any issues and you can zero the superblock (man mdadm)
> No need to zero the disk.

Would --re-add be better?

I've noticed something else since I made the initial post.

--------- begin output -------------
fly:~# mdadm -D /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
     Array Size : 312602368 (298.12 GiB 320.10 GB)
    Device Size : 156301184 (74.53 GiB 80.03 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Jan 30 15:52:01 2009
          State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : fly:FlyFileServ_md  (local to host fly)
           UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
         Events : 16

    Number   Major   Minor   RaidDevice State
       0      33        1        0      active sync   /dev/hde1
       1      34        1        1      active sync   /dev/hdg1
       2      56        1        2      active sync   /dev/hdi1
       5      89        1        3      active sync   /dev/hdo1
       6      88        1        4      active sync   /dev/hdm1

fly:~# mdadm -E /dev/hdo1
/dev/hdo1:
          Magic : a92b4efc
        Version : 01
    Feature Map : 0x1
     Array UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
           Name : fly:FlyFileServ_md  (local to host fly)
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
   Raid Devices : 5

    Device Size : 234436336 (111.79 GiB 120.03 GB)
     Array Size : 625204736 (298.12 GiB 320.10 GB)
      Used Size : 156301184 (74.53 GiB 80.03 GB)
   Super Offset : 234436464 sectors
          State : clean
    Device UUID : e072bd09:2df53d6d:d23321cc:cf2c37de

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Jan 30 15:52:01 2009
       Checksum : 4689ff5 - correct
         Events : 16

         Layout : left-symmetric
     Chunk Size : 64K

     Array Slot : 5 (0, 1, 2, failed, failed, 3, 4)
    Array State : uuuUu 2 failed
--------- end output -------------

Why does the "Array Slot" field show 7 slots, and why does the
"Array State" field show 2 failed?  There were only ever 6 disks in
the array, and only one of them is currently missing.  mdadm -D above
doesn't list any failed devices in its "Failed Devices" field.
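In case it matters for the answer, this is roughly what I was planning
to run to put the drive back in, going by your superblock suggestion
above.  The device name is only my guess at what the kernel will call
the disk once it is back on a good cable (whether it is /dev/hdk or
/dev/hdk1 I'll confirm with mdadm --examine first):

  # try a plain re-add first; with the internal bitmap md may only need
  # to catch up whatever changed while the disk was out
  mdadm /dev/md0 --re-add /dev/hdk1

  # if md refuses that, wipe the stale superblock and add the disk
  # back as a brand-new spare instead
  mdadm --zero-superblock /dev/hdk1
  mdadm /dev/md0 --add /dev/hdk1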
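And for the filesystem errors discussed further down: following your
"verify against a backup" advice, the plan is a checksum-only dry run
of rsync from the last known-good backup, one run per filesystem,
roughly like this (the paths are made up, and I realise a -c pass over
~300 GB will take a while):

  rsync -rnc --itemize-changes /backup/fileserv/ /mnt/array/ > /root/verify.log

Anything the log shows as differing gets restored from the backup copy.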
Thanks for your answers below as well.  It's kind of what I was
expecting.  There was a h/w problem that took ages to track down and I
think it was responsible for all the e2fs errors.

WG

> > - When I tried to list some files on one of the filesystems
> > on the array (the fact that it took so long to react to
> > the ls is how I discovered the box was in the middle of
> > rebuilding to spare)

> This is OK - resync involves a lot of IO and can slow things down.  This
> is tuneable.

> > it couldn't find the file (or many
> > others).  I thought that resyncing was supposed to be
> > transparent, yet parts of the fs seemed to be missing.
> > Everything was there afterwards.  Is that normal?

> No.  This is nothing to do with normal md resyncing and certainly not
> expected.

> > - On a subsequent boot I had to run e2fsck on the three
> > filesystems housed on the array.  Many stray blocks,
> > illegal inodes, etc. were found.  An artifact of the rebuild
> > or unrelated?

> Well, you had a fault in your IO system; there's a good chance your IO
> broke.
>
> Verify against a backup.
>
> David
>
> --
> "Don't worry, you'll be fine; I saw it work in a cartoon once..."

--
whollygoat@xxxxxxxxxxxxxxx

--
http://www.fastmail.fm - IMAP accessible web-mail

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html