[Hm, linux-scsi ought to be cc'd on this...]

mike.redan@xxxxxxx wrote:

>> Here they are:
>> Nov 10 02:08:08 192.168.207.10/192.168.207.10 kernel: sd 0:0:0:0: SCSI
>> error: return code = 0x00070000
>> Nov 10 02:08:08 192.168.207.10/192.168.207.10 kernel: end_request: I/O
>> error, dev sda, sector 77429847
>
> Yep, I've seen that now too. It looks to me like we're getting
> DID_ERROR for some reason. The only reason for that in the libata code
> seems to deal with bad SCSI commands and/or memory allocation problems,
> but I'll keep digging.

These errors are memory allocation problems in libata. When I plug a
whole lot of SAS and SATA disks into my x260 and run the pounder stress
test, the amount of buffers on my system increases over a period of
about twenty minutes until libata can no longer allocate ata_queued_cmd
structures. At this point we start seeing the errors above. Since we
can't allocate new commands, libsas/aic94xx never even get called, which
is why they are silent on the matter.

However, if I kill pounder before totally running out of memory, the
amount of buffers will decrease very rapidly and the system is ok.

So, a question to you, Mr. Redan: What does /proc/meminfo look like at
crash time? If you have a huge amount of buffers, then we're seeing the
same thing.

And a question for everyone else: Because the buffers drain out fairly
quickly after pounder dies, does this mean that the controller is being
subjected to too much I/O at once?

--D
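
For anyone comparing notes, here is a minimal userspace sketch (Python, not kernel code) of the two checks discussed above: pulling the host byte out of the logged return code, and reading one field such as Buffers out of /proc/meminfo text. The helper names host_byte_name and meminfo_kb are made up for illustration; the DID_* values are the host-byte codes from the kernel's include/scsi/scsi.h, and the sketch assumes the usual result layout with the host byte in bits 16-23.

```python
# Host-byte codes as defined by the Linux SCSI midlayer (include/scsi/scsi.h).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0A: "DID_PASSTHROUGH",
    0x0B: "DID_SOFT_ERROR",
}


def host_byte_name(return_code):
    """Name the host byte (bits 16-23) of a SCSI result word.

    Hypothetical helper, assuming the layout
    status | msg << 8 | host << 16 | driver << 24.
    """
    host = (return_code >> 16) & 0xFF
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


def meminfo_kb(text, field):
    """Return the value of one /proc/meminfo field (in kB), or None.

    Hypothetical helper; `text` is the file's contents, so it can be
    used on a saved copy taken at crash time as well as on the live file.
    """
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if name.strip() == field and parts:
            return int(parts[0])
    return None
```

Passing 0x00070000 to host_byte_name gives DID_ERROR, which matches the reading in the quoted text. For the memory side, meminfo_kb(open("/proc/meminfo").read(), "Buffers") sampled every few seconds during a pounder run should show whether the figure climbs the way it does here.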