[ ... ]

>>> Personalities : [raid1]
>>> md0 : active raid1 sda1[0] sdc2[2] sdb2[1]
>>>       96256 blocks [3/3] [UUU]
>>>
>>> md1 : active raid1 sda3[0] sdc4[2] sdb4[1]
>>>       730523648 blocks [3/3] [UUU]
>>>       [>....................]  resync =  0.4% (3382400/730523648) finish=14164.9min speed=855K/sec
>>>
>>> I see my array is reconstructing, but I can't tell which
>>> disk failed. [ ... ] The system is currently sluggish and
>>> the load is 13 [ ... ]

If your kernel is one that puts IO wait in the load average,
that's expected when there is heavy IO load that makes the
resync slow.

>> A more recent check shows the speed continuing to rise; [ ... ]

Perhaps because the 'fsck' ended, as the slowness is likely to
have been caused by a long 'fsck' consequent to an abrupt
shutdown:

>> [ ... ] The resulting shutdown (which was a manual power
>> off) leaves the arrays and their components in a funky state.
>> When the system comes back, it fixes things up. [ ... ]

Plus the poor alignment of the 'sda' partitions is cutting write
rates very significantly. Your 'sd[bc]' disks instead are GPT
partitioned, and GPT partitions are by default 1MiB aligned, but
you probably used some very old tool, and 'sd[bc]4' are only
1KiB aligned:

$ factor 6835938
6835938: 2 3 17 29 2311

Someone else has pointed out the large difference in partition
sizes between 'sda' and 'sd[bc]'; while that does not cause a
speed issue, the RAID set just shrinks to the size of the
smallest member. Indeed it is reported as 730M blocks, which is
the equivalent of the 1461047490 sectors reported by 'fdisk' for
'sda3'. Probably you should have a 2-disk RAID1 of 'sd[bc]'
alone.

>> Even if this did happen, in RAID1 wouldn't some of the
>> components (partitions in my case) be deemed good and others
>> bad, with the latter resynced to match the former? And if
>> that is happening, why can't I tell which partition(s) are
>> master (considered good) and which are not

Because you haven't read some relevant documentation...

>> (being overwritten with contents of the master)?

Two ways, for example (example commands are at the end of this
message):

* The "event counts" reported by 'mdadm --examine' will be
  different (a higher event count means more recent).

* 'iostat' will tell you which drives are being read and which
  are being written.

> I checked the logs and didn't see anything about a drive
> failing, though there were some smartd reports of changes in
> drive parameters like temperature.

The kernel logs always tell if a resync is triggered by a
failure, but note that a resync happens either on a failure,
when a spare is added to the RAID set to replace the failed
drive, or when the drives are out of sync because of an abrupt
shutdown, which seems to be your case.

Anyhow, the ways to look at the health of the disks suggested by
others are somewhat misleading. The first thing is to have a
mental model of possible disk failure modes... Anyhow, the most
relevant data are the number of reallocated sectors reported by
'smartctl -A' (too many indicates a failing disk) and the SMART
selftest and error logs, which show the frequency of issues.
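Concretely, those numbers can be pulled like this (a sketch,
assuming the drives are /dev/sd[abc]; 'smartctl' needs root):

$ sudo smartctl -A /dev/sda | grep -i 'reallocated'
$ sudo smartctl -l selftest /dev/sda
$ sudo smartctl -l error /dev/sda

The same for 'sdb' and 'sdc', of course.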
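For the event counts, something along these lines (component
device names taken from the 'md1' line above; the exact output
format varies with the metadata version):

$ for d in /dev/sda3 /dev/sdb4 /dev/sdc4; do
>     echo -n "$d  "
>     sudo mdadm --examine "$d" | grep -i 'events'
> done

The member(s) with the lower event count are the ones being
brought up to date.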
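And to watch the direction of the traffic during the resync
('iostat' is in the sysstat package; 5 is the reporting interval
in seconds):

$ iostat -dx /dev/sd[abc] 5

Drives showing mostly reads are the sync sources; a drive
showing mostly writes is the one being rewritten.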
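As for the alignment issue above, one quick check is each
partition's start sector in sysfs (a sketch; it assumes 512-byte
sectors, so 2048 sectors = 1MiB):

$ for p in /sys/block/sd[abc]/sd[abc]*; do
>     s=$(cat "$p/start")
>     echo "$(basename "$p"): start=${s}s 1MiB-aligned=$((s % 2048 == 0))"
> done

A '1' means the start is a multiple of 1MiB; a start of 6835938,
as for 'sd[bc]4', yields '0'.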
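As an aside, while waiting out a slow resync, the md speed caps
can be inspected and (temporarily, until reboot) raised; values
are in KiB/s per device:

$ cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
$ sudo sysctl -w dev.raid.speed_limit_min=50000

Raising the minimum makes the resync more aggressive against
competing IO, at the cost of an even more sluggish system.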