Re: kernel errors when hda is read? Help!

Matthew Galgoci <mgalgoci@xxxxxxxxxx> · Thu, 20 Nov 2003 15:41:08 -0500 (EST)

On Thu, 20 Nov 2003, Benjamin J. Weiss wrote:

> All,
> 
> I'm getting the following errors when I try to read or write some areas of
> my hda.  I've tried running badblocks, but it doesn't find anything.  I've
> tried booting from the rescue disk and running "fsck -p -c -f -v /dev/hda2"
> (one of the partitions giving me particular trouble), but it came back
> claiming to be clean.  The drive is less than a year old and shouldn't be
> having trouble, though the drive has been getting a bunch of use lately.
> 
> in my /var/log/messages, I have a lot of:
> 
> Nov 20 07:26:15 mail kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=6576516, sector=6367608
> Nov 20 07:26:15 mail kernel: end_request: I/O error, dev 03:02 (hda), sector
> 6367608
> Nov 20 07:26:15 mail kernel: hda: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Nov 20 07:26:15 mail kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=6576516, sector=6367616
> Nov 20 07:26:15 mail kernel: end_request: I/O error, dev 03:02 (hda), sector
> 6367616
> Nov 20 07:26:15 mail kernel: hda: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Nov 20 07:26:15 mail kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=6576516, sector=6367624
> Nov 20 07:26:15 mail kernel: end_request: I/O error, dev 03:02 (hda), sector
> 6367624
> Nov 20 07:26:15 mail kernel: hda: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Nov 20 07:26:15 mail kernel: hda: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=6576516, sector=6367632
> Nov 20 07:26:15 mail kernel: end_request: I/O error, dev 03:02 (hda), sector
> 6367632

That's either noise on the cable or a failing disk. Make sure you have a backup handy
and try replacing the IDE cable with something less questionable. Also, you might want to
look into using smartd from the kernel-utils package to monitor the health of this device.
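
If it helps to read those log lines bit by bit, here is a minimal Python sketch that
decodes the status= and error= values and pulls out the distinct LBAsect numbers. The
bit names are the ones the old IDE driver prints, as far as I know them; the function
names (decode, failing_lbas) and the sample text are just for illustration.

```python
import re

# ATA status register bits, named the way the old Linux IDE driver prints them.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

# ATA error register bits (0x20 and 0x08 relate to removable media and are left out).
ERROR_BITS = {
    0x80: "BadSector",
    0x40: "UncorrectableError",
    0x10: "SectorIdNotFound",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in table.items() if value & bit]


def failing_lbas(log_text):
    """Return the distinct LBAsect values found in the log text, sorted."""
    return sorted({int(n) for n in re.findall(r"LBAsect=(\d+)", log_text)})


# Hypothetical sample, unwrapped the way it would appear in /var/log/messages.
sample = (
    "kernel: hda: dma_intr: error=0x40 { UncorrectableError }, "
    "LBAsect=6576516, sector=6367608\n"
    "kernel: hda: dma_intr: error=0x40 { UncorrectableError }, "
    "LBAsect=6576516, sector=6367616\n"
)

print(decode(0x51, STATUS_BITS))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode(0x40, ERROR_BITS))   # ['UncorrectableError']
print(failing_lbas(sample))       # [6576516]
```

In the log you posted, every entry reports the same LBAsect (6576516) while the
requested sector changes, so the failures may all trace back to one spot on the disk.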

I do have a 120GB/7200rpm Maxtor at home that started doing this, and I assumed the drive
was bad. While debugging it to determine whether this was a drive-health issue or something
else, I used hdparm to turn the transfer rates down to the lowest possible settings, then
repartitioned and reformatted, and the problem went away. The drive is now back running at
full speed, but I don't trust it for data integrity anymore. I use it as scratch space for
compiling stuff.

As near as I can tell, there was some weird firmware hiccup that hdparm reset. I stress that
this is only a wild guess. I haven't seen those errors since.
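
For reference, a rough sketch of what "lowest possible settings" could look like with
hdparm. The exact flags are an assumption, not a record of what was run on that drive:
-d0 turns DMA off, and -X08 selects PIO mode 0 (the -X value for PIO modes is the mode
number plus 8). The helper name slow_transfer_commands is made up, and the script only
prints the command lines rather than running them.

```python
def slow_transfer_commands(device="/dev/hda"):
    """Build hdparm command lines that drop a drive to its slowest transfer settings.

    The flag choice is an assumption for illustration; check the hdparm man page
    for your version before running anything.
    """
    return [
        ["hdparm", "-d0", device],   # disable DMA for the drive
        ["hdparm", "-X08", device],  # PIO mode 0: -X takes the PIO mode number plus 8
    ]


# Print the commands instead of executing them; running them needs root.
for cmd in slow_transfer_commands():
    print(" ".join(cmd))
```

The hdparm man page warns that -X should be used with caution, so try this only on a
drive whose contents are already backed up.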

-- 
Matthew Galgoci
System Administrator
Red Hat, Inc
919.754.3700 x44155
