Mark Lord wrote:
> Mark Lord wrote:
>> ..
>> I triggered this by accident, issuing an IDENTIFY command
>> which incorrectly specified ATA_PROT_NODATA. My error, for sure,
>> but libata never recovered from the "stuck DRQ bit" that resulted.
> ...
>> sda: Mode Sense: 00 3a 00 00
>> SCSI device sda: write cache: enabled, read cache: enabled, doesn't
>> support DPO or FUA
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
>>          res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> ata1: soft resetting port
>> ata1.00: configured for UDMA/100
>> ata1: EH complete
>> SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
>> sda: Write Protect is off
>> sda: Mode Sense: 00 3a 00 00
>> SCSI device sda: write cache: enabled, read cache: enabled, doesn't
>> support DPO or FUA
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
>>          res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> ata1: soft resetting port
>> ata1.00: configured for UDMA/100
>> ata1: EH complete
> ...
> (over and over)
>
> Say.. is this problem as simple as excessive retries for an SG_IO command?
> There shouldn't really be *any* retries here, and it should eventually
> just fail the command rather than shut down the port.
>
> Or am I just reading the logs wrong?

libata EH isn't trying to retry the command. It's trying to revalidate
the device after resetting it to make sure that the device is still
there and listening to commands. As the device fails to respond to
reset and the following IDENTIFY, libata EH assumes that the device is
dead one way or the other and gives up on the device after a few
reset/revalidate retries.
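
To make the distinction concrete, here is a rough toy model of the
reset/revalidate loop described above. It is only a sketch and not the
actual libata code: sim_device, sim_reset(), sim_identify(),
eh_recover() and the EH_MAX_TRIES value of 3 are all made up for
illustration. The point is that the loop never reissues the command
that failed; it only resets the port and checks whether the device
answers again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical retry budget; the real number of tries is an assumption here. */
#define EH_MAX_TRIES 3

/* Toy stand-in for a device: it starts answering after 'resets_needed' resets. */
struct sim_device {
    int resets_needed;  /* resets required before IDENTIFY works; negative = never */
    int resets_seen;    /* resets issued so far */
    bool enabled;       /* cleared once recovery gives up on the device */
};

/* Simulated port reset: just counts how often we were reset. */
void sim_reset(struct sim_device *dev)
{
    dev->resets_seen++;
}

/* Simulated revalidation (IDENTIFY): succeeds once enough resets were seen. */
bool sim_identify(const struct sim_device *dev)
{
    return dev->resets_needed >= 0 && dev->resets_seen >= dev->resets_needed;
}

/*
 * Reset the port and revalidate the device, up to EH_MAX_TRIES times.
 * The originally failed command is never retried here.
 * Returns 0 if the device answered, -1 if it was given up on and disabled.
 */
int eh_recover(struct sim_device *dev)
{
    assert(dev != NULL);

    for (int tries = 0; tries < EH_MAX_TRIES; tries++) {
        sim_reset(dev);
        if (sim_identify(dev))
            return 0;       /* device is back and listening */
    }

    dev->enabled = false;   /* assumed dead one way or the other */
    return -1;
}
```

In this model a device that recovers after one or two resets stays
enabled, while one that never answers IDENTIFY is disabled once the
retry budget is used up.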
--
tejun