Hi Karel,

On 09/24/2013 12:28 PM, Karel Walters wrote:
> Will find a way to do proper scrubbing and alter the timeouts on startup.
>> for x in /sys/block/sd[d-h]/device/timeout ; do echo 180 >$x ; done
> done!

Good.

>> { In the future, buy drives that wake up with ERC enabled (like your WD
>> Reds), or at least capable of enabling ERC (at every powerup). }
> Reds are on the desk next to me and will replace the raid array.

Very Good.  Mind you, the Seagates are good enough drives, they just
aren't suited to raid arrays.  Changing the driver timeouts will get you
by, but when you do encounter an error, the three minute pause will kick
many applications in the teeth.  I have a few Seagates like this kicking
around that I use for offsite backups.

>> Next, you will have to figure out which of the bumped drives belongs in
>> which slot in the array.  An old dmesg (from before the failures) or an
>> archived "mdadm --detail" would tell us that.  This is important,
>> because you *will* need to use --create --assume-clean as the drives are
>> now marked as spare--the info needed for forced assembly is gone.
>
> This is a problem for me and maybe a harsh lesson, I added an old
> dmesg output at the end but I'm not too sure about it.

Yes, that dmesg did the trick.  The drive that failed first was #3, and
the drive that failed second was #4.

You should create a list of which drive serial number corresponds to
which raid device role, with a third column showing the current device
name.  Then we can construct an "mdadm --create --assume-clean" command
that generates the correct order.  And I would leave the partially
synced spare out entirely.

Then, to deal with the large number of pending events, you'll need to do
a "check" scrub with a very low speed limit, to keep you from exceeding
the 10/hour read error limit in the MD kernel driver.

{ Or you can scrub at full speed until it kicks drives out, then force
assemble and restart the scrub.  Many times over in your case.
}

Phil
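
For reference, the two sysfs steps discussed above (raising the driver timeouts at startup, and running a "check" scrub under a low speed cap) could be sketched roughly as below. This is only a sketch under stated assumptions: the array name md0 and the 1000 KB/s cap are placeholders that do not come from this thread, and the SYSFS variable is a made-up convenience so the functions can be tried against a scratch directory before touching the real /sys.

```shell
#!/bin/sh
# Sketch only, not a tested recovery procedure.
# SYSFS is a hypothetical override; it defaults to the real /sys.
SYSFS=${SYSFS:-/sys}

# set_timeouts SECONDS DEV...
# Write the SCSI command timeout for each named disk, skipping any disk
# whose timeout file is missing or not writable.
set_timeouts() {
    secs=$1
    shift
    for dev in "$@"; do
        f="$SYSFS/block/$dev/device/timeout"
        if [ -w "$f" ]; then
            echo "$secs" > "$f"
        else
            echo "set_timeouts: cannot write $f, skipping" >&2
        fi
    done
}

# slow_check ARRAY KBPS
# Cap the per-array sync speed, then request a "check" scrub.
slow_check() {
    md="$SYSFS/block/$1/md"
    if [ ! -w "$md/sync_action" ]; then
        echo "slow_check: $md/sync_action not writable" >&2
        return 1
    fi
    echo "$2" > "$md/sync_speed_max" || return 1
    echo check > "$md/sync_action"
}

# Example calls (as root, e.g. from a boot script). The device names match
# the sd[d-h] glob quoted above; md0 and 1000 are placeholders:
#   set_timeouts 180 sdd sde sdf sdg sdh
#   slow_check md0 1000
```

The speed cap has to be picked by watching how fast read errors accumulate during the scrub; the value shown is not a recommendation.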