199 UDMA_CRC_Error_Count 0x0008 002 001 000 Old_age Offline - 798
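For anyone who wants to compare this counter across several disks, the raw value is the last column of the attribute line quoted above. Below is a minimal sketch, assuming the usual ten-column attribute table printed by `smartctl -A`. The names `crc_error_count`, `read_disk` and `SAMPLE` are made up for illustration, and `read_disk` assumes smartctl is installed and the script is run as root.

```python
import subprocess

CRC_ATTR = "UDMA_CRC_Error_Count"  # SMART attribute 199


def crc_error_count(smart_text):
    """Return the raw UDMA_CRC_Error_Count found in `smartctl -A` text, or None."""
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows are assumed to have at least 10 columns:
        # ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw value.
        if len(fields) >= 10 and fields[1] == CRC_ATTR:
            return int(fields[9])
    return None


def read_disk(device):
    """Hypothetical helper: run smartctl on one device and parse its output."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return crc_error_count(result.stdout)


# Sample row modelled on the line quoted in this thread.
SAMPLE = "199 UDMA_CRC_Error_Count 0x0008 002 001 000 Old_age Offline - 798"
```

Calling `crc_error_count(SAMPLE)` gives the raw count from the quoted row. Looping `read_disk` over a list of device names such as `/dev/sda` and `/dev/sdb` would show which disk of a mirrored pair has a non-zero or growing value.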
Denys, I ran a smartctl check on 12 disks in 5 different servers (mostly pairs of disks in software mirrors), and all of them except one had 0 for UDMA_CRC_Error_Count. Only one had 16: a SAMSUNG HD300LD installed on 2006.09.12 and running 24/7, so its uptime is about 13200 hours. However, its pair (same disk model, same uptime) has 0, which makes me think it is not the motherboard/controller but the cable or the HDD. (Wikipedia describes the attribute as: "The number of errors in data transfer via the interface cable as determined by ICRC (Interface Cyclic Redundancy Check)." http://en.wikipedia.org/wiki/Self-Monitoring,_Analysis,_and_Reporting_Technology ) Is that value growing on your server?

Related to my original issue (exception / hard resetting link), which Denys later also experienced and continued on this thread, my current status is:

1) I received mail from another person, who wrote:
I have a similar problem with an N680SLI, as posted here: http://forums.gentoo.org/viewtopic-t-641372-highlight-.html Short version - 2.6.22 seems stable, anything later, unstable.
Since reproducing the problem takes days, weeks or even months, he can't say more yet; he promised to write to the list if he finds out anything.

2) I replaced the motherboard with a different one. It is a Gigabyte as well, but it has no nvidia/jmicron controllers; instead it has ata_piix and ahci onboard and, ironically, an add-on sil24 card... So far the system has been running well [knock-knock] under a heavy stress test for 3 weeks now, without problems. I believe Tejun suggested trying to remove one of the HDDs online to see what happens; I will try this later today, when I am at the server, and let you know.

(For those who need a refresher, my initial thread was at http://www.mail-archive.com/linux-ide@xxxxxxxxxxxxxxx/msg15950.html and it continued at http://www.opensubscriber.com/message/linux-ide@xxxxxxxxxxxxxxx/8633679.html ; my latest mail on this topic is at: http://www.opensubscriber.com/message/linux-ide@xxxxxxxxxxxxxxx/8718520.html )

G.