Volker Lanz wrote:
On Monday, 16 March 2009 16:06:46 Jeff Garzik wrote:
Your hardware is telling you that it is seeing CRC errors on the ATA bus,
that is, a hardware problem that causes data transfer over the SATA cable
to fail.
Possible sources of problems: bad SATA cable, bad power supply, bad
SATA port on device or mainboard, bad mainboard, ...
Faulty hardware is of course always possible. It's hard to imagine something
is wrong with the hardware in this case, though:
* Both drives are affected, a Samsung and a Seagate. We can rule out the
drives.
* Both drives worked in AHCI mode with the same cables and power supply with
the ICH9 based Gigabyte board: We can rule out the drives themselves again and
also the cables.
* Different SATA cables did not help. We can rule out the cables again.
* Unplugging everything in the machine except the drive with the root
partition and the video card did not help: We can rule out the power supply (a
high quality Tagan 400W model, not some OEM junk) with a near 100%
probability, I suppose.
So, all that's left is the mainboard. But on this same machine, AHCI in
Microsoft Windows XP works without any problem whatsoever.
I would double-check that you are using an AHCI driver in XP, since that
is uncommon. Most XP drivers for AHCI-capable hardware program the SATA
device in legacy IDE mode.
I know the old Linux-stresses-your-hardware-more-than-Windows tale, but I
haven't seen proof of that for 10 years now. This finally seems to rule out the
mainboard and at least to me it appears the software side is all that is left.
What can I do to diagnose this further short of buying an additional power
supply and mainboard?
Well, the root cause of these errors is CRC errors during transmission
over the SATA cable. You can extrapolate from there... it could be
heat, poor cable shielding, dirty A/C power, bad RAM, who knows.
Your AHCI chip is telling Linux "hardware CRC error, that I could not
recover from" and Linux is dutifully reporting that back to you.
Software is unlikely as the cause, considering the volume of AHCI chips
in users' hands versus the volume of bug reports like yours.
Every problem is different, of course, but the standard recommendations are:
- trying different SATA ports on the mainboard. SATA ports are easy to
break or "fuzz" into uselessness
- replacing cables
- replacing the power supply
- running memtest86 or replacing RAM, etc.
- disabling AHCI mode
Mainboards can definitely go bad, but the above tends to be of more help.
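
When swapping ports and cables as suggested above, it can help to see whether
the CRC errors follow one port or show up on all of them. The following is only
a rough sketch, not an official tool: it assumes the kernel log marks these
errors with the strings "ICRC" or "BadCRC" on lines prefixed with an "ataN"
port name, which is how libata usually reports them. The function name and the
sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Strings assumed to mark interface CRC problems in libata log lines
# (an assumption for this sketch; not an exhaustive list).
CRC_MARKERS = ("ICRC", "BadCRC")

# Matches the "ataN:" or "ataN.MM:" prefix of a libata log line.
PORT_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?:")


def count_crc_errors_by_port(log_lines):
    """Count CRC-related log lines per ATA port name (e.g. 'ata1')."""
    counts = Counter()
    for line in log_lines:
        # Skip lines that do not mention a CRC marker at all.
        if not any(marker in line for marker in CRC_MARKERS):
            continue
        match = PORT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample lines (hypothetical, not taken from the reporter's log).
sample_log = [
    "ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen",
    "ata1.00: error: { ICRC ABRT }",
    "ata3: SError: { BadCRC Handshk }",
    "ata3.00: error: { ICRC ABRT }",
]

for port, n in sorted(count_crc_errors_by_port(sample_log).items()):
    print(f"{port}: {n} CRC-related line(s)")
```

To use it on a real machine, feed it the lines of the saved dmesg output
instead of sample_log. If the count stays with one port after moving the drive,
suspect the port; if it follows the drive or cable, suspect those instead.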
Jeff