The more I think about it, the more it sounds like a faulty hard disk.
I'll just RMA it. Sorry for the trouble.

Wesley

On 4/1/06, Wesley Lim <wesleylim@xxxxxxxxx> wrote:
> On 4/1/06, David Greaves <david@xxxxxxxxxxxx> wrote:
> > I take it badblocks shows the drives as OK? (mine does)
> > smartctl says that SMART overall-health self-assessment is passed.
>
> I've tinkered around with badblocks and the results are surprising. I
> really hope this is not a case of bad hard disk that intermittently
> decides to act up.
>
> badblocks gives the same errors as when copying files into the drive.
> *However* badblocks reports 0 bad sectors and no libata error if I
> test the exact same sectors it reported were bad *immediately after
> the previous (errored) run* (no cooldown/reboot/etc) It might have to
> do with length of SATA bus activity? I get both libata and badblocks
> errors around the 60000000 sector range after badblocks has been
> running ~1 hour into a test. if I just run badblocks over those
> reported blocks, it always shows up fine, no libata or badblocks error
> (including badblocks -w)
>
> I experience the errors only on the reading phase of badblocks, not
> the writing phase. Also, as said above, I get zero errors when I
> repeat badblocks -w over any specific "problem" areas. This cycle
> repeats itself. (if I run a full drive scan, similar errors occur. if
> I check just those "bad" blocks, they disappear)
>
> I'm going to try starting a long block scan from a different address n
> and see if the reported "bad" blocks show up around the n+60000000
> sector range.
>
> > Mark published *two* patches to provide scsi op/cmd info which you may
> > want to apply. (25/2/06 and 14/2/06)
> > (They don't quite apply cleanly - IIRC Mark has a DPRINTK macro that
> > needs changing to a printk call)
>
> I just patched my kernel. The new error messages are below.
>
> > By the way, 2.6.16 is not kicking the drives from the array for me.
> > Did you mean 2.6.15.1?
> > (just checking)
>
> Yup, 2.6.16.1
>
>
> Here's the new dmesg (running badblocks)
>
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> sd 0:0:0:0: SCSI error: return code = 0x8000002
> sda: Current: sense key: Medium Error
> Additional sense: Unrecovered read error - auto reallocate failed
> end_request: I/O error, dev sda, sector 61590952
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> ata1: no sense translation for op=0x28 cmd=0x25 status: 0x51
> ata1: translated op=0x28 cmd=0x25 ATA stat/err 0x51/00 to SCSI
> SK/ASC/ASCQ 0x3/11/04
> ata1: status=0x51 { DriveReady SeekComplete Error }
> sd 0:0:0:0: SCSI error: return code = 0x8000002
> sda: Current: sense key: Medium Error
> Additional sense: Unrecovered read error - auto reallocate failed
> end_request: I/O error, dev sda, sector 61590960
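
For anyone reading the log above, here is a rough sketch (not taken from libata or badblocks) of how the status byte 0x51 breaks down into the flags the kernel prints, and how a sector number from an end_request line might be mapped to a badblocks block number for a targeted re-test. The helper names are made up for illustration. The conversion assumes 512-byte sectors and a 1024-byte badblocks block size (its default when -b is not given); adjust both if your setup differs.

```python
# Illustrative only: helper names are hypothetical, not part of libata or badblocks.

# ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def sector_to_block(sector, sector_size=512, block_size=1024):
    """Map a kernel-reported sector to a block number of the given size.

    Assumption: sectors are 512 bytes and blocks are 1024 bytes unless
    other sizes are passed in.
    """
    return sector * sector_size // block_size


if __name__ == "__main__":
    # Status byte taken from the dmesg output in this thread.
    print(hex(0x51), decode_ata_status(0x51))
    # Sectors taken from the two end_request lines in this thread.
    for sector in (61590952, 61590960):
        print("sector", sector, "-> block", sector_to_block(sector))
```

The resulting block numbers could then be used as the first/last block arguments of a short badblocks run, although as described above the short runs have so far come back clean.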