I started about a year ago with a 5x2TB RAID 5. At the beginning of February, I came home from work and my drives were all making these crazy beeping noises. At that point I was on kernel version .34. I shut down and rebooted the server, and the RAID array didn't come back online. I noticed one drive was going up and down, and determined that the drive had actual physical damage to the power connector and was losing and regaining power through vibration. No problem. I bought another hard drive and mdadm started recovering to the new drive. I got it back to a RAID 5, backed up my data, then started growing to a RAID 6, and my computer hung hard, to the point where even REISUB was ignored. I restarted and resumed the grow. Then I started getting errors like these; they repeat for a minute or two and then the device gets failed out of the array:

[  193.801507] ata4.00: exception Emask 0x0 SAct 0x40000063 SErr 0x0 action 0x0
[  193.801554] ata4.00: irq_stat 0x40000008
[  193.801581] ata4.00: failed command: READ FPDMA QUEUED
[  193.801616] ata4.00: cmd 60/08:f0:98:c8:2b/00:00:10:00:00/40 tag 30 ncq 4096 in
[  193.801618]          res 51/40:08:98:c8:2b/00:00:10:00:00/40 Emask 0x409 (media error) <F>
[  193.801703] ata4.00: status: { DRDY ERR }
[  193.801728] ata4.00: error: { UNC }
[  193.804479] ata4.00: configured for UDMA/133
[  193.804499] ata4: EH complete

First on one drive, then on another, then on another: as the slow grow to RAID 6 was happening, these messages kept coming up and taking drives down. Eventually (over the course of the week-long grow) the failures were happening faster than I could recover them, and I had to resort to ddrescueing RAID components to keep the array from going under the minimum number of components. I ended up having to ddrescue 3 failed drives and force the array assembly to get back to 5 drives, and by that time the array's ext4 file system could no longer mount (it said something about group descriptors being corrupted).
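In case it matters, the ddrescue-and-reassemble path I took was along these lines (device names and paths here are illustrative placeholders, not my actual layout):

```shell
# Illustrative only -- /dev/sdX names and the log path are placeholders.
# Clone each failing member onto a fresh disk; the mapfile lets the
# copy resume after a hang instead of starting over:
ddrescue -f /dev/sdb1 /dev/sdf1 /root/sdb1.ddrescue.map

# Then assemble from the surviving/cloned members, forcing mdadm to
# accept components whose event counts no longer match:
mdadm --assemble --force /dev/md0 \
      /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
```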
By this time, every one of the original drives had been replaced, and this has been ongoing for 5 months. I didn't even want to do an fsck to *attempt* to fix the file system until I got a solid RAID 6. I upgraded my kernel to .40, bought another hard drive, put it in there, and started the grow. Within an hour the system froze. I rebooted and restarted the array (and the grow); 2 hours later the system froze again. I rebooted and restarted the array (and the grow) again, and got those same errors again, this time on a drive that I had bought last month. Frustrated (feeling like this will never end), I let it keep going, hoping to at least get back to RAID 5. A few hours later I got these errors AGAIN on ANOTHER drive I got last month (of a different brand and model). So now I'm back with a non-functional array and a pile of 6 dead drives (not counting the ones still in the computer, components of a now-incomplete array).

What is going on here? If brand-new drives from a month ago, from two different manufacturers, are failing, something else is going on. Is it my motherboard? I've run memtest for 15 hours so far with no errors, and I'll let it go for 48 before I stop it, so let's assume it's not the RAM for now. Not included in this history are SEVERAL times the machine locked up harder than a REISUB, almost always during the heavy IO of component recovery. It seems to stay up for weeks when the array is inactive (and I'm too busy with other things to deal with it), and then as soon as I put a new drive in and the recovery starts, it hangs within an hour, hangs again every few hours after that, and eventually I get the "failed command: READ FPDMA QUEUED / status: { DRDY ERR } / error: { UNC }" errors and another drive falls off the array. I don't mind buying a new motherboard if that's what it is (I've already spent almost a grand on hard drives); I just want to get this fixed/stable and put the nightmare behind me.
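For completeness, the grow I keep restarting is roughly the following (array name, device, and backup-file path are illustrative placeholders):

```shell
# Illustrative only -- device and paths are placeholders.
# Add the new disk as a spare, then reshape RAID 5 -> RAID 6:
mdadm --add /dev/md0 /dev/sdh1
mdadm --grow /dev/md0 --level=6 --raid-devices=6 \
      --backup-file=/root/md0-grow.backup

# After each freeze, reassemble and the reshape resumes where it left off:
mdadm --assemble /dev/md0 --backup-file=/root/md0-grow.backup
```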
Here is the dmesg output from my last boot, where two drives failed at timestamps 193 and 12196: http://paste.ubuntu.com/5753575/

Thanks for any thoughts on the matter