SATA disconnects

Hi,

I have an AMD64-based machine that regularly (about once every 1.5 days) has problems with one of its SATA disks. It runs 64-bit Etch (2.6.18-4-xen-amd64). Unfortunately, the values in the ata logs don't mean much to me, and I don't know where to look up their meaning.
Any explanation of what those codes mean would be greatly appreciated!

The disk is connected to a VIA controller:
00:0f.0 RAID bus controller [0104]: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller [1106:3149] (rev 80)

sata_via 0000:00:0f.0: version 2.0
GSI 18 sharing vector 0xB0 and IRQ 18
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 18
sata_via 0000:00:0f.0: routed to hard irq line 10
ata3: SATA max UDMA/133 cmd 0xD000 ctl 0xCC02 bmdma 0xC000 irq 18
ata4: SATA max UDMA/133 cmd 0xC800 ctl 0xC402 bmdma 0xC008 irq 18
scsi2 : sata_via
ata3: SATA link down 1.5 Gbps (SStatus 0 SControl 300)
ATA: abnormal status 0x7F on port 0xD007
scsi3 : sata_via
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATA-7, max UDMA/133, 160086528 sectors: LBA
ata4.00: ata4: dev 0 multi count 16
ata4.00: configured for UDMA/133
 Vendor: ATA       Model: Maxtor 6Y080M0    Rev: YAR5
 Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 160086528 512-byte hdwr sectors (81964 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 160086528 512-byte hdwr sectors (81964 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb2 < sdb5 sdb6 >
sd 3:0:0:0: Attached scsi disk sdb
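
For reading the "SStatus" and "SControl" values in the lines above, here is a minimal sketch that splits them into the DET/SPD/IPM fields of the SATA register layout. It assumes libata prints both values in hex; the function names and description strings are illustrative and do not come from the kernel.

```python
# Sketch: decode the SStatus / SControl values that libata prints (hex).
# Field layout (SATA spec): bits 3:0 = DET, bits 7:4 = SPD, bits 11:8 = IPM.
# Function names and wording are illustrative, not kernel identifiers.

SSTATUS_DET = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline (interface disabled or in loopback)",
}
SSTATUS_SPD = {0x0: "no negotiated speed", 0x1: "Gen1 (1.5 Gbps)", 0x2: "Gen2 (3.0 Gbps)"}
SSTATUS_IPM = {
    0x0: "device not present or no communication",
    0x1: "interface active",
    0x2: "interface in partial power state",
    0x6: "interface in slumber power state",
}

SCONTROL_DET = {0x0: "no action requested", 0x1: "interface reset requested",
                0x4: "interface disabled"}
SCONTROL_SPD = {0x0: "no speed restriction", 0x1: "limit to Gen1", 0x2: "limit to Gen2"}
SCONTROL_IPM = {0x0: "no power-state restrictions", 0x1: "partial disabled",
                0x2: "slumber disabled", 0x3: "partial and slumber disabled"}


def split_fields(value):
    """Return the (DET, SPD, IPM) nibbles of a SStatus/SControl value."""
    return value & 0xF, (value >> 4) & 0xF, (value >> 8) & 0xF


def decode_sstatus(value):
    det, spd, ipm = split_fields(value)
    return {"DET": SSTATUS_DET.get(det, "reserved"),
            "SPD": SSTATUS_SPD.get(spd, "reserved"),
            "IPM": SSTATUS_IPM.get(ipm, "reserved")}


def decode_scontrol(value):
    det, spd, ipm = split_fields(value)
    return {"DET": SCONTROL_DET.get(det, "reserved"),
            "SPD": SCONTROL_SPD.get(spd, "reserved"),
            "IPM": SCONTROL_IPM.get(ipm, "reserved")}


if __name__ == "__main__":
    for raw in (0x113, 0x0):
        print("SStatus 0x%x:" % raw, decode_sstatus(raw))
    print("SControl 0x300:", decode_scontrol(0x300))
```

Read this way, "SStatus 113" on ata4 says a device is present with the link established at 1.5 Gbps and active, while "SStatus 0" on ata3 says nothing is attached. "SControl 300" only indicates that link power-management transitions are disabled.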

This is what's in syslog:
Apr 27 11:31:45 quark kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Apr 27 11:31:45 quark kernel: ata4.00: (BMDMA stat 0x1)
Apr 27 11:31:45 quark kernel: ata4.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
Apr 27 11:31:46 quark kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr 27 11:31:53 quark kernel: ata4: port is slow to respond, please be patient
Apr 27 11:32:16 quark kernel: ata4: port failed to respond (30 secs)
Apr 27 11:32:16 quark kernel: ata4: soft resetting port
Apr 27 11:32:16 quark kernel: ATA: abnormal status 0xD0 on port 0xC807
Apr 27 11:32:16 quark last message repeated 5 times
Apr 27 11:32:46 quark kernel: ata4.00: qc timeout (cmd 0xec)
Apr 27 11:32:46 quark kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 27 11:32:46 quark kernel: ata4.00: revalidation failed (errno=-5)
Apr 27 11:32:46 quark kernel: ata4: failed to recover some devices, retrying in 5 secs
Apr 27 11:32:52 quark kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr 27 11:32:59 quark kernel: ata4: port is slow to respond, please be patient
Apr 27 11:33:22 quark kernel: ata4: port failed to respond (30 secs)
Apr 27 11:33:22 quark kernel: ata4: soft resetting port
Apr 27 11:33:22 quark kernel: ATA: abnormal status 0xD0 on port 0xC807
Apr 27 11:33:22 quark last message repeated 5 times
Apr 27 11:33:52 quark kernel: ata4.00: qc timeout (cmd 0xec)
Apr 27 11:33:52 quark kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 27 11:33:52 quark kernel: ata4.00: revalidation failed (errno=-5)
Apr 27 11:33:52 quark kernel: ata4: failed to recover some devices, retrying in 5 secs
Apr 27 11:33:57 quark kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr 27 11:34:04 quark kernel: ata4: port is slow to respond, please be patient
Apr 27 11:34:27 quark kernel: ata4: port failed to respond (30 secs)
Apr 27 11:34:27 quark kernel: ata4: soft resetting port
Apr 27 11:34:27 quark kernel: ATA: abnormal status 0xD0 on port 0xC807
Apr 27 11:34:27 quark last message repeated 5 times
Apr 27 11:34:57 quark kernel: ata4.00: qc timeout (cmd 0xec)
Apr 27 11:34:57 quark kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Apr 27 11:34:57 quark kernel: ata4.00: revalidation failed (errno=-5)
Apr 27 11:34:57 quark kernel: ata4.00: disabled
Apr 27 11:34:58 quark kernel: ata4: EH complete
Apr 27 11:34:58 quark kernel: sd 3:0:0:0: SCSI error: return code = 0x00040000
Apr 27 11:34:58 quark kernel: end_request: I/O error, dev sdb, sector 138101809
Apr 27 11:34:58 quark kernel: sd 3:0:0:0: SCSI error: return code = 0x00040000
Apr 27 11:34:58 quark kernel: end_request: I/O error, dev sdb, sector 138101821

The last two lines keep repeating for different sectors.
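
As a rough aid for the hex fields in the syslog excerpt, the sketch below decodes the ATA status register ("stat", "abnormal status"), the libata error mask ("Emask", "err_mask"), the two command opcodes that appear, and the host byte of the SCSI return code. The status bits and opcodes follow the ATA specification. The AC_ERR_* and DID_* names are taken from the kernel headers as I understand them and should be checked against the running kernel's source. The helper names are made up for this example.

```python
# Sketch: decode the hex fields seen in the libata / SCSI error messages.
# Helper names (flags, scsi_host_byte) are illustrative only.

# ATA status register bits (ATA specification).
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
               (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR")]

# libata error-mask bits (AC_ERR_* in include/linux/libata.h; assumed to
# match the 2.6.18 kernel, verify against the actual source tree).
ERR_MASK_BITS = [(1 << 0, "AC_ERR_DEV"), (1 << 1, "AC_ERR_HSM"),
                 (1 << 2, "AC_ERR_TIMEOUT"), (1 << 3, "AC_ERR_MEDIA"),
                 (1 << 4, "AC_ERR_ATA_BUS"), (1 << 5, "AC_ERR_HOST_BUS"),
                 (1 << 6, "AC_ERR_SYSTEM"), (1 << 7, "AC_ERR_INVALID"),
                 (1 << 8, "AC_ERR_OTHER")]

# Only the two opcodes that occur in the log above.
COMMANDS = {0xCA: "WRITE DMA", 0xEC: "IDENTIFY DEVICE"}

# Host byte values of a SCSI result (DID_* in the kernel's SCSI headers).
HOST_BYTE = {0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
             0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
             0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
             0x09: "DID_BAD_INTR"}


def flags(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table if value & bit]


def scsi_host_byte(result):
    """Extract and name the host byte (bits 23:16) of a SCSI result word."""
    code = (result >> 16) & 0xFF
    return HOST_BYTE.get(code, "unknown (0x%02x)" % code)


if __name__ == "__main__":
    print("stat 0x40 :", flags(0x40, STATUS_BITS))
    print("stat 0xD0 :", flags(0xD0, STATUS_BITS))
    print("Emask 0x4 :", flags(0x4, ERR_MASK_BITS))
    print("cmd 0xca  :", COMMANDS[0xCA])
    print("cmd 0xec  :", COMMANDS[0xEC])
    print("return code 0x00040000 :", scsi_host_byte(0x00040000))
```

On this reading, the first failure is a WRITE DMA that timed out, each following IDENTIFY DEVICE also times out, and status 0xD0 has BSY set, meaning the drive still reports busy after the soft reset (with BSY set, the other status bits are not meaningful). The 0x00040000 return code has host byte DID_BAD_TARGET, which would be consistent with the device having been disabled by that point.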

smartctl does not appear to show anything in the drive's logs.

Interestingly, the disk reappears and operates normally after 'echo 1 > /sys/class/pci_bus/0000:0f.0/host3/rescan'.
Unfortunately, LVM won't reconnect to it.

This machine is running Xen in 1.5G with two guests. The failing disk provides storage via LVM to one of those.

I can keep this machine running like this for another week or so.

Any help is greatly appreciated!
Thanks,
Jan Evert
