Hi Tejun,

Sorry it took some time to respond. I went away for the holidays and just
returned yesterday.

We were using Linux 2.6.14. I saw a message on one of the forums that
suggested moving to Linux 2.6.18 because its improved error handling might
fix this problem. I tried that by moving the entire SCSI framework from
2.6.18 into 2.6.14 (the full 2.6.18 is not stable on our platform). After
doing this difficult task I managed to get rid of the lockups, but I still
get error messages and drive 'stalls'. Unfortunately, I don't have them
recorded anywhere because the error messages don't seem to hurt anything.
The drive locks for about 30 seconds, the driver does a 'soft reset', and
then the drive comes back alive. It's far from optimal, but at least the
system is usable.

I will try to repeat the test and capture the error messages again so I can
send them to you, but if you have any ideas before then please let me know.

Cheers,
Steve...

--- Tejun Heo <htejun@xxxxxxxxx> wrote:

> Hello,
>
> Steve Graham wrote:
> > My name is Steve Graham and I work for a small
> > startup. Our company is developing a server board
> > with the Silicon Images 3512 and we are getting some
> > strange lockups during high levels of disk activity.
> > The test I'm currently running to cause the problem is
> > to run the following concurrently: 'nbench',
> > 'tiobench', and an 'scp' of a 200Meg file to the sata
> > drive. Every so often I will get the following
> > message:
> >
> > ata1: status=0x51 { DriveReady SeekComplete Error }
> > ata1: error=0x04 { DriveStatusError }
>
> Which kernel version are you running?
>
> > This doesn't mean the drive is locked up and doesn't
> > appear to have any side effects on its own but
> > eventually I will get the above message that is
> > immediately followed by the next block of messages
> > that do result in a lockup:
> >
> > ata1: command 0x35 timeout, stat 0xd1 host_stat 0x1
> > ata1: status=0xd1 { Busy }
> > sd 0:0:0:0: SCSI error: return code = 0x8000002
> > sda: Current: sense key=0xb
> > ASC=0x47 ASCQ=0x0
> > end_request: I/O error, dev sda, sector 17033103
> > ata1: Abnormal status 0xD1 on port 0xC001E087
> > ata1: Alternate status 0xD1 on port 0xC001E08A
> > ata1: Error 0xd1
> > ata1: Abnormal status 0xD1 on port 0xC001E087
> > ata1: Alternate status 0xD1 on port 0xC001E08A
> > ata1: Error 0xd1
> > ata1: Abnormal status 0xD1 on port 0xC001E087
> > ata1: Alternate status 0xD1 on port 0xC001E08A
>
> This is message from old error handling and doesn't really contain much
> useful info. Even if you have to use previous kernel in production
> system, providing error messages from 2.6.19 will help chasing down the
> cause.
>
> --
> tejun
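
For reference, the status bytes in the quoted log can be decoded by hand. The following is only a minimal sketch in Python, assuming the standard ATA status register bit layout. The function name decode_ata_status and the table ATA_STATUS_BITS are hypothetical and are not part of libata. The labels for Busy, DriveReady, SeekComplete and Error are taken from the log itself; the remaining labels are my own choice.

```python
# Standard ATA status register bits, from bit 7 down to bit 0.
# Only "Busy", "DriveReady", "SeekComplete" and "Error" appear in the quoted
# log; the other labels are assumptions chosen for illustration.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte.

    While Busy is set the remaining bits are not valid, so only "Busy" is
    reported. This matches the log printing 0xd1 as { Busy }.
    """
    if status & 0x80:
        return ["Busy"]
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # The two status values that appear in the quoted messages.
    for value in (0x51, 0xD1):
        print(f"status=0x{value:02x}", "{", " ".join(decode_ata_status(value)), "}")
```

Read this way, 0x51 is a drive that is ready and reporting an error, while 0xd1 has the Busy bit set, so the drive never finished the command before the timeout.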