Re: ahci timeouts, retries etc.

On 10/14/2009 10:51 AM, Tim Small wrote:
> Hi,
>
> I have a Tyan S5375 (BIOS v1.03) ICH9 which periodically (approx twice a
> week) logs timeouts like this:
>
> [6475755.652262] ata2.00: exception Emask 0x0 SAct 0x3832 SErr 0x0 action 0x6 frozen
> [6475755.652262] ata2.00: cmd 60/18:08:2a:90:ee/00:00:12:00:00/40 tag 1 ncq 12288 in
> [6475755.652262] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [6475755.652262] ata2.00: status: { DRDY }
> [6475755.652262] ata2.00: cmd 61/60:20:6a:8c:ee/00:00:12:00:00/40 tag 4 ncq 49152 out
> [6475755.652262] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ...
> [6475755.652262] ata2.00: cmd 60/10:68:6a:65:ee/00:00:12:00:00/40 tag 13 ncq 8192 in
> [6475755.652262] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [6475755.652262] ata2.00: status: { DRDY }
> [6475755.652262] ata2: hard resetting link
> [6475756.009863] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [6475756.040731] ata2.00: configured for UDMA/133
> [6475756.040731] sd 1:0:0:0: [sdb] 1953525168 512-byte hardware sectors (1000205 MB)
> [6475756.040731] sd 1:0:0:0: [sdb] Write Protect is off
> [6475756.040731] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [6475756.040731] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> A look at the libata wiki suggests interrupt delivery problems as a
> possible explanation, but is this likely to be the case here? I'm
> guessing that multiple interrupts must have been dropped by the time
> this error has occurred, as multiple requests are queued for the drive?

An interrupt delivery problem doesn't seem too likely here. Interrupt delivery normally either works or it doesn't; it doesn't randomly fail once in a while.
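
To see how many commands were actually outstanding when the port froze, the SAct value from the exception line can be read as a bitmask with one bit per NCQ tag, and each "cmd" line can be cross-checked against it. A rough Python sketch follows. It assumes the usual libata field order for the cmd string (command/feature:nsect:lbal:lbam:lbah/hob fields/device) and that, for NCQ commands, the tag sits in the upper five bits of nsect and the sector count in the feature field. The function names are illustrative only and not part of any kernel tool.

```python
def active_ncq_tags(sact: int) -> list[int]:
    """Return the NCQ tags whose bits are set in an SAct value."""
    return [tag for tag in range(32) if sact & (1 << tag)]


def decode_ncq_cmd(cmd: str) -> dict:
    """Decode a libata 'cmd' taskfile string for an NCQ read/write.

    Assumed layout (hex bytes):
    command/feature:nsect:lbal:lbam:lbah/
    hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    groups = [[int(b, 16) for b in g.split(":")] for g in cmd.split("/")]
    command = groups[0][0]
    feature, nsect, lbal, lbam, lbah = groups[1]
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = groups[2]

    # For NCQ commands the sector count is carried in the feature fields.
    sectors = (hob_feature << 8) | feature
    lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
           | (lbah << 16) | (lbam << 8) | lbal)
    ops = {0x60: "read", 0x61: "write"}  # READ/WRITE FPDMA QUEUED
    return {
        "op": ops.get(command, hex(command)),
        "tag": nsect >> 3,            # tag lives in bits 7:3 of nsect
        "bytes": sectors * 512,       # assumes 512-byte logical sectors
        "lba": lba,
    }


# Values copied from the log in the original report.
print(active_ncq_tags(0x3832))
print(decode_ncq_cmd("60/18:08:2a:90:ee/00:00:12:00:00/40"))
```

Under these assumptions, SAct 0x3832 has six bits set, so six commands were in flight at the time of the timeout, and the decoded tag and byte count of each cmd line should agree with the "tag N ncq BYTES" text the kernel prints next to it.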


> I'm assuming that the kernel will retry these requests after the sata
> link has been reset?

Yes.


> The errors appear to be randomly distributed over the four drives on
> this machine - all are Seagate ST31000340NS with either firmware version
> SN05 or SN16...

This kind of problem often seems to be due to signal integrity or power problems. For whatever reason, an insufficient power supply (or something like an overloaded power cable) tends to trigger SATA errors as an early symptom.
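
One way to check whether the timeouts really are spread evenly, or cluster on ports that share a cable or power connector, is to tally the exception lines per port from the saved kernel log. A minimal Python sketch, assuming log lines in the same format as the ones quoted in the report; the function name and the sample input are illustrative only.

```python
import re
from collections import Counter

# Matches e.g. "ata2.00: exception Emask 0x0 ..." and captures "ata2".
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions_by_port(lines):
    """Count libata exception lines per ATA port name."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input: two lines taken from the log in the original report.
sample = [
    "[6475755.652262] ata2.00: exception Emask 0x0 SAct 0x3832 SErr 0x0 action 0x6 frozen",
    "[6475755.652262] ata2: hard resetting link",
]
print(count_exceptions_by_port(sample))
```

Run over a few weeks of logs, a strongly uneven count would point at something specific to that port's cabling or power feed, while an even spread is more consistent with a shared cause such as the power supply.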