Strange arbitrary port resets on ICH9R with Seagate drives

Hello

I've just purchased a brand spanking new G33/ICH9R-based system for use as a home fileserver, with 4x Seagate ST3750840AS SATA drives as the main grunt drives.

The problem is that all of the Seagate drives keep resetting, as this dmesg excerpt shows:

[ 2114.613486] ata5: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0x2 frozen
[ 2114.613494] ata5: (irq_stat 0x00400040, connection status changed)
[ 2115.188869] ata5: waiting for device to spin up (8 secs)
[ 2116.832307] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0x2 frozen
[ 2116.832314] ata6: (irq_stat 0x00400040, connection status changed)
[ 2117.405372] ata6: waiting for device to spin up (8 secs)
[ 2123.316046] ata5: soft resetting port
[ 2123.487789] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 2123.529172] ata5.00: ata_hpa_resize 1: sectors = 1465149168, hpa_sectors = 1465149168
[ 2123.587389] ata5.00: ata_hpa_resize 1: sectors = 1465149168, hpa_sectors = 1465149168
[ 2123.587395] ata5.00: configured for UDMA/133
[ 2123.587400] ata5: EH complete
[ 2123.587628] SCSI device sdb: 1465149168 512-byte hdwr sectors (750156 MB)
[ 2123.587862] sdb: Write Protect is off
[ 2123.587866] sdb: Mode Sense: 00 3a 00 00
[ 2123.588054] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 2125.532548] ata6: soft resetting port
[ 2125.704290] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 2125.751647] ata6.00: ata_hpa_resize 1: sectors = 1465149168, hpa_sectors = 1465149168
[ 2125.809858] ata6.00: ata_hpa_resize 1: sectors = 1465149168, hpa_sectors = 1465149168
[ 2125.809865] ata6.00: configured for UDMA/133
[ 2125.809869] ata6: EH complete
[ 2125.810182] SCSI device sdc: 1465149168 512-byte hdwr sectors (750156 MB)
[ 2125.810338] sdc: Write Protect is off
[ 2125.810342] sdc: Mode Sense: 00 3a 00 00
[ 2125.810527] SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
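
For anyone reading the SErr values in the excerpt above, here is a minimal Python sketch that splits an SErr value into its individual bits. The bit names are an assumption based on my reading of the SATA SError register layout (they are not taken from this log), and `decode_serr` is a hypothetical helper name.

```python
# Bit positions and names are an assumption based on the SATA SError
# register layout; check them against the spec before relying on them.
SERR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communications error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "Phy internal error",
    18: "Comm wake",
    19: "10b to 8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unknown FIS type",
    26: "exchanged (device presence changed)",
}


def decode_serr(value):
    """Return the names of all bits set in an SErr value, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(SERR_BITS.get(bit, f"reserved bit {bit}"))
    return names


if __name__ == "__main__":
    # SErr value reported for ata5 and ata6 in the excerpt above.
    for name in decode_serr(0x4010000):
        print(name)
```

Under that assumed layout, 0x4010000 has only bits 16 and 26 set, which would match the "connection status changed" text the kernel prints next to it.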

Hardware:

00:00.0 Host bridge: Intel Corporation Unknown device 29c0 (rev 02)
00:02.0 VGA compatible controller: Intel Corporation Unknown device 29c2 (rev 02)
00:03.0 Communication controller: Intel Corporation Unknown device 29c4 (rev 02)
00:1a.0 USB Controller: Intel Corporation Unknown device 2937 (rev 02)
00:1a.1 USB Controller: Intel Corporation Unknown device 2938 (rev 02)
00:1a.2 USB Controller: Intel Corporation Unknown device 2939 (rev 02)
00:1a.7 USB Controller: Intel Corporation Unknown device 293c (rev 02)
00:1b.0 Audio device: Intel Corporation Unknown device 293e (rev 02)
00:1c.0 PCI bridge: Intel Corporation Unknown device 2940 (rev 02)
00:1c.3 PCI bridge: Intel Corporation Unknown device 2946 (rev 02)
00:1c.4 PCI bridge: Intel Corporation Unknown device 2948 (rev 02)
00:1d.0 USB Controller: Intel Corporation Unknown device 2934 (rev 02)
00:1d.1 USB Controller: Intel Corporation Unknown device 2935 (rev 02)
00:1d.2 USB Controller: Intel Corporation Unknown device 2936 (rev 02)
00:1d.7 USB Controller: Intel Corporation Unknown device 293a (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation Unknown device 2916 (rev 02)
00:1f.2 SATA controller: Intel Corporation Unknown device 2922 (rev 02)
00:1f.3 SMBus: Intel Corporation Unknown device 2930 (rev 02)
02:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
02:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)

The CPU is a Core 2 Duo E4400.

The ICH9R is running in AHCI mode, which is pretty much a necessity as I want hotplugging. NO accesses are being performed on the drives; the problems started as soon as they were plugged in. Interestingly, more information is dumped at boot, I think when mdadm tries to access the drives: even though I only abortively tried to set up an array on them, it still seems to think there are RAID superblocks on there or something.

[ 45.673182] ata6.00: exception Emask 0x50 SAct 0x1 SErr 0x4890800 action 0x2 frozen
[ 45.673186] ata6.00: (irq_stat 0x08400040, interface fatal error, connection status changed)
[ 45.673192] ata6.00: cmd 60/58:00:30:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 45056 in
[ 45.673193] res 40/00:00:30:00:00/00:00:00:00:00/40 Emask 0x50 (ATA bus error)

ATA bus error... riiight...

I also have an older Maxtor 6L300S0 that is acting as the OS/backup drive for the system. Plugging it into the same ports with exactly the same cables gives no errors. The Maxtor is completely happy running with NCQ. The SATA CDROM is completely happy. I also tried limiting the drives to 1.5 Gbps; the results are the same with or without the limit.
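
The negotiated link speed can be cross-checked against the "SATA link up" lines in the dmesg excerpt, which print SStatus and SControl in hex. Below is a rough Python sketch that splits such a value into its three 4-bit fields. The field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) is an assumption taken from my reading of the AHCI register definitions, and `decode_link_reg` is a hypothetical helper name.

```python
# Assumed layout of the SStatus / SControl registers:
#   bits 3:0  DET  (device detection)
#   bits 7:4  SPD  (interface speed)
#   bits 11:8 IPM  (interface power management)
SPD_NAMES = {0: "none / no restriction", 1: "1.5 Gbps", 2: "3.0 Gbps"}


def decode_link_reg(value):
    """Split an SStatus or SControl value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


if __name__ == "__main__":
    # Values from the "SATA link up" lines: SStatus 113, SControl 300 (hex).
    for label, text in (("SStatus", "113"), ("SControl", "300")):
        fields = decode_link_reg(int(text, 16))
        speed = SPD_NAMES.get(fields["spd"], "unknown")
        print(label, fields, "speed field:", speed)
```

Under that assumed layout, SStatus 113 has SPD = 1, consistent with the "1.5 Gbps" the kernel reports, while SControl 300 has SPD = 0, which as far as I understand means the host side is not imposing a speed limit of its own.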

In a limited attempt at bugfixing, I disabled NCQ by executing the following:

echo 1 > /sys/block/sd[bcde]/device/queue_depth

Previously the file contained 31. The errors still occur even with no I/O at all, and they seem completely independent of I/O transactions anyway: I can cat /dev/urandom > /dev/sd[bcde] quite happily without the kernel spewing errors at me, and similarly a read of the drives to /dev/null doesn't result in anything too dramatic.
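
One caveat for anyone repeating the queue_depth step: in some shells (bash, for one) a redirect to a glob that matches several files fails as an ambiguous redirect, so the value may need to be written per device. A minimal Python sketch of that, assuming the four drives are sdb to sde as in the command above; `set_queue_depth` and its `sys_block` parameter are hypothetical names introduced here for illustration.

```python
from pathlib import Path


def set_queue_depth(devices, depth, sys_block="/sys/block"):
    """Write `depth` to each device's queue_depth file and return the paths written.

    A depth of 1 is what the post uses to disable NCQ. Writing under
    /sys/block needs root; `sys_block` can be pointed elsewhere for a dry run.
    """
    written = []
    for dev in devices:
        path = Path(sys_block) / dev / "device" / "queue_depth"
        path.write_text(f"{depth}\n")
        written.append(str(path))
    return written


# Example call (as root), assuming the device names from the post:
#   set_queue_depth(["sdb", "sdc", "sdd", "sde"], 1)
```

Reading the files back afterwards confirms whether the new depth took effect on each drive.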


Any ideas?