>>> On Mon, 31 Mar 2008 10:27:46 -0700, Robert L Mathews
>>> <lists@xxxxxxxxxxxxx> said:

[ ... ]

> I do see that both disks are under "ide:1". Is that what you
> mean?

Indeed the symptoms reported are likely to be from drives on the
same channel.

>> This is not something from mdadm, anyway. Once the disk "dies"
>> you are losing the disk bus, and that is "all she wrote".

That happens when the disk dies badly, but it is common enough.

> So mdadm can't protect against disk failures on these machines?

You can expect the Linux IO and RAID subsystems to only handle
reported, clean errors, after which the state of the whole
machine is well defined and known.

If you have high availability requirements, perhaps you should
buy from an established storage vendor a storage system designed
by integration engineers and guaranteed by the vendor for some
high availability level.

> Whenever a disk returns a write error, the machine will lock
> up?

Perhaps without realizing it you have engaged in storage system
design and integration, and there are many, many, many, many
subtle pitfalls in that (as the archives of this list show
abundantly).

You cannot just slap things together and it all works. Have you
done even sketchy common mode failure analysis?

Also, putting two drives belonging to a RAID set on the same
IDE/ATA channel is usually a bad idea for performance too.
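
As a very rough illustration of the most basic common mode check
(do any members of the RAID set share a channel?), here is a
minimal sketch. It assumes you have already worked out by hand
which channel each member sits on; the function name, the device
names and the channel names are all made up for illustration and
are not taken from the machine discussed above.

```python
from collections import defaultdict


def shared_channels(member_channels):
    """Return {channel: [devices]} for each channel carrying 2+ RAID members.

    member_channels maps a device name to the channel it is attached to.
    Both are plain strings supplied by the caller; nothing is probed.
    """
    by_channel = defaultdict(list)
    for device, channel in member_channels.items():
        by_channel[channel].append(device)
    # Keep only the channels that are a shared point of failure.
    return {ch: sorted(devs) for ch, devs in by_channel.items() if len(devs) > 1}


if __name__ == "__main__":
    # Hypothetical layout: both members of the set on the same channel.
    layout = {"hdc": "ide1", "hdd": "ide1"}
    for channel, devices in shared_channels(layout).items():
        print(f"{channel}: {', '.join(devices)} share a channel")
```

An empty result only says that no two members share a channel; it
says nothing about other shared components (controller, cable,
power supply), which a real analysis would also have to cover.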