Re: Port multiplier resets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Tejun Heo wrote:
Yan Seiner wrote:
I'm seeing errors on a new port multiplier install.  The mobo has a
J-Micron SATA controller: JMicron Technologies, Inc.  JMicron
20360/20363 AHCI Controller (rev 03) The port multiplier has a SIL
3726 chipset.  AFAICT, this is a supported comnbination.  The kernel
recognizes the chipset.  The system has an adaptec SCSI controller
w/ 2 15K SCSI drives, 6 internal SATA drives, and 3 external SATA
drives using the port multiplier. Because of a broken bios (ARGH!) I
have to power up the external drives after the system scans the SCSI
bus.  This seems to cause no problems.  Here's a log of a recent
boot (scroll down for the error I see after the system is booted):

Nov  2 05:26:07 selene kernel: [   28.691524] ata8: SATA max UDMA/133 abar m8192@0xfdcfe000 port 0xfdcfe180 irq 16
Nov  2 05:26:07 selene kernel: [   34.104580] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  2 05:26:07 selene kernel: [   34.104580] ata8.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
Nov  2 05:26:07 selene kernel: [   34.104580] ata8.00: hard resetting link
Nov  2 05:26:07 selene kernel: [   34.425000] ata8.00: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   34.425003] ata8.01: hard resetting link
Nov  2 05:26:07 selene kernel: [   35.001232] ata8.01: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   35.001234] ata8.02: hard resetting link
Nov  2 05:26:07 selene kernel: [   35.321389] ata8.02: SATA link down (SStatus 0 SControl 320)
Nov  2 05:26:07 selene kernel: [   35.321419] ata8.03: hard resetting link
Nov  2 05:26:07 selene kernel: [   35.869739] ata8.03: SATA link down (SStatus 0 SControl 320)
Nov  2 05:26:07 selene kernel: [   35.869769] ata8.04: hard resetting link
Nov  2 05:26:07 selene kernel: [   36.621491] ata8.04: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   36.621493] ata8.05: hard resetting link
Nov  2 05:26:07 selene kernel: [   36.941025] ata8.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   36.942522] ata8.00: ATA-8: ST31000340AS, SD15, max UDMA/133
Nov  2 05:26:07 selene kernel: [   36.942525] ata8.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Nov  2 05:26:07 selene kernel: [   36.944479] ata8.00: configured for UDMA/133
Nov  2 05:26:07 selene kernel: [   53.613888] ata8.01: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Nov  2 05:26:07 selene kernel: [   53.613888] ata8.04: failed to IDENTIFY (I/O error, err_mask=0x100)
Nov  2 05:26:07 selene kernel: [   53.613888] ata8: failed to recover some devices, retrying in 5 secs
Nov  2 05:26:07 selene kernel: [   61.420677] ata8.01: hard resetting link
Nov  2 05:26:07 selene kernel: [   66.420763] ata8.15: qc timeout (cmd 0xe4)
Nov  2 05:26:07 selene kernel: [   66.420770] ata8.01: failed to read SCR 2 (Emask=0x4)
Nov  2 05:26:07 selene kernel: [   66.420773] ata8.01: failed to read SCR 2 (Emask=0x40)
Nov  2 05:26:07 selene kernel: [   66.420775] ata8.01: COMRESET failed (errno=-5)
Nov  2 05:26:07 selene kernel: [   66.420804] ata8.01: failed to read SCR 0 (Emask=0x40)
Nov  2 05:26:07 selene kernel: [   66.420806] ata8.01: reset failed, giving up
Nov  2 05:26:07 selene kernel: [   66.420835] ata8.15: hard resetting link
Nov  2 05:26:07 selene kernel: [   68.284883] ata8.15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  2 05:26:07 selene kernel: [   68.285116] ata8.00: hard resetting link
Nov  2 05:26:07 selene kernel: [   68.604947] ata8.00: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   68.604949] ata8.01: hard resetting link
Nov  2 05:26:07 selene kernel: [   69.329438] ata8.01: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   69.329440] ata8.02: hard resetting link
Nov  2 05:26:07 selene kernel: [   69.649777] ata8.02: SATA link down (SStatus 0 SControl 320)
Nov  2 05:26:07 selene kernel: [   69.649807] ata8.03: hard resetting link
Nov  2 05:26:07 selene kernel: [   69.969439] ata8.03: SATA link down (SStatus 0 SControl 320)
Nov  2 05:26:07 selene kernel: [   69.969469] ata8.04: hard resetting link
Nov  2 05:26:07 selene kernel: [   70.813852] ata8.04: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   70.813855] ata8.05: hard resetting link
Nov  2 05:26:07 selene kernel: [   71.377665] ata8.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Nov  2 05:26:07 selene kernel: [   71.381475] ata8.00: configured for UDMA/133
Nov  2 05:26:07 selene kernel: [   71.381475] ata8.01: ATA-8: ST31000340AS, SD15, max UDMA/133
Nov  2 05:26:07 selene kernel: [   71.381475] ata8.01: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Nov  2 05:26:07 selene kernel: [   71.381879] ata8.01: configured for UDMA/133
Nov  2 05:26:07 selene kernel: [   71.381879] ata8.04: ATA-8: ST31000340AS, SD15, max UDMA/133
Nov  2 05:26:07 selene kernel: [   71.381879] ata8.04: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Nov  2 05:26:07 selene kernel: [   71.381890] ata8.04: configured for UDMA/133
Nov  2 05:26:07 selene kernel: [   71.389769] ata8: EH complete

The link probably got a PHY event after reset sequence is complete.
Maybe our timinig is too aggressive or the PMP is just quirky but as
long as detection succeeds in the end, it should be okay.

Once the system is up and running, I get these errors.  They appear
anywhere from several hours to a few minutes apart.  What exactly
does this mean?  It doesn't seem to have any impact on the
performance of the drives.  Typically, these drives are very heavily
loaded; I've been dumping > 1TB of data to them while rebuilding the
RAID-5 array, so they're pretty well maxed out.

Nov  2 05:40:13 selene kernel: [  984.254543] ata8.15: exception Emask 0x10 SAct 0x0 SErr 0x780101 action 0x7
Nov  2 05:40:13 selene kernel: [  984.254549] ata8.15: irq_stat 0x0c000000
Nov  2 05:40:13 selene kernel: [  984.254552] ata8: SError: { RecovData UnrecovData 10B8B Dispar BadCRC Handshk }
Nov  2 05:40:13 selene kernel: [  984.254557] ata8.01: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0

It indicates that the link experienced ATA bus transmission failure.
Heh... Not only that the controller turned on all possible link error
bits.  Hmmm... The second device has active commands and it's the same
device which caused problem during boot too.  Can you please try the
followings?

1. Swap the second drive with another one and see whether the error
   follows the drive or stays with the slot.

2. Specify libata.force=8.15:1.5Gbps

Sorry for the delayed response. I've installed another controller based on the SiI 3132 chipset. The esata array is the same; I've simply moved the esata cables to the new controller. My observations:

1. The JMicron 20360/20363 AHCI Controller (rev 03) controller is far, far slower. hdparm clocks it at 20 mb/sec and a raid check takes about 15 hours.

2. The JMicron controller experiences the above errors.

3. The Sil 3132 controller is faster. hdparm says 110 mb/sec and a raid check takes about 5 hours. This is confirmed by usage - I can stream 2 videos, record 2 videos, and commercial flag a video in myth with no visible stuttering. The JMicron controller stutters even with a single video streaming and recording.

4.  The Sil 3132 controller doesn't have any errors even under high load.

Google says the rev 3 version of the JMicron controller has these issues, and to turn off NCQ on it. It also indicates that these problems should have been resolved in my kernel - version 2.6.27.4 - but apparently not. :-(

Is there a way to turn off NCQ on that controller without impacting the Sil 3132 controller? Any other things to try?

--Yan
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux