Recently a drive failed on one of our file servers. The machine has three RAID6 arrays (15 x 1TB drives each, plus spares). I let the rebuild onto a spare finish and then started the process of replacing the failed drive. Unfortunately I'd misplaced my list of drive IDs, so I generated a new one in order to identify the failed drive: I used "smartctl" in a quick script to scan all 48 drives and generate pretty output (a rough reconstruction is in the P.S. below). That was a mistake. After running it a couple of times, one of the controllers failed and several disks in the first array were marked failed.

I worked on the machine for a while (it has an NFS root) and got some information from it before it rebooted via the watchdog. I've dumped all of that information here:

http://lairds.us/temp/ucmeng_md/

- mdstat_0 shows the status of the arrays right after the controller failure.
- mdstat_1 shows the status after the reboot.
- sys_block is a listing of the block devices, so you can see that the problem drives are on controller 1.
- The examine_sd?1 files show "mdadm -E" output from each drive in md0. Note that the Events count differs for the drives on the problem controller.

I'd like to know whether this is something I can recover. I do have backups, but restoring this much data would be a huge pain.

Thank you.

--kyler
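
P.S. For reference, the scan script was nothing fancy. Reconstructed roughly (the exact smartctl flags and output formatting may have differed), it was along these lines:

  #!/bin/bash
  # Walk every sd* block device and pull the model, serial number and
  # SMART health status out of smartctl, printing one line per drive.
  for dev in /dev/sd[a-z] /dev/sd[a-z][a-z]; do
      [ -b "$dev" ] || continue
      info=$(smartctl -i -H "$dev")
      model=$(echo "$info"  | awk -F': +' '/Device Model/   {print $2}')
      serial=$(echo "$info" | awk -F': +' '/Serial Number/  {print $2}')
      health=$(echo "$info" | awk -F': +' '/overall-health/ {print $2}')
      printf '%-10s %-25s %-20s %s\n' "$dev" "$model" "$serial" "$health"
  done

The serial numbers from that output were what I was using as the drive IDs.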