Re: ata error EH in SWNCQ mode, with nVidia MCP55 sata controller and SAMSUNG HD103UJ

Robert Hancock <hancockrwd@xxxxxxxxx> · Tue, 05 Jan 2010 18:50:36 -0600

On 01/05/2010 11:56 AM, Marco Bisetto wrote:
Hi,

I have a problem with an "IDE interface: nVidia Corporation MCP55 SATA
Controller (rev a3)" and two SAMSUNG HD103UJ SATA hard disk drives.
Disabling the write cache on a disk produces this error:

kernel: [   45.584445] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
kernel: [   45.584445] ata1: SWNCQ:qc_active 0x1 defer_bits 0x0 last_issue_tag 0x0
kernel: [   45.584445]   dhfis 0x1 dmafis 0x1 sdbfis 0x0
kernel: [   45.584445] ata1: ATA_REG 0x40 ERR_REG 0x0
kernel: [   45.584445] ata1: tag : dhfis dmafis sdbfis sacitve
kernel: [   45.584445] ata1: tag 0x0: 1 1 0 1
kernel: [   45.584445] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
kernel: [   45.584445] ata1.00: cmd 61/08:00:3f:f4:e8/00:00:02:00:00/40 tag 0 ncq 4096 out
kernel: [   45.584445]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
kernel: [   45.584445] ata1.00: status: { DRDY }
kernel: [   45.584445] ata1: hard resetting link

The error appears four times for each disk at startup, and only when the
disk has its write cache disabled. For example, with the write cache
disabled on both disks:

[   45.595788] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   45.595800] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   76.491877] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   76.511331] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  107.391075] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  107.423701] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  138.287465] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  138.332093] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1

With the write cache disabled on the disk attached to ata4 and enabled on
the disk attached to ata1:

[   45.583489] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   76.479940] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  107.375643] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  138.272023] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1

Enabling write cache on both disks = no errors.

I don't think the problem can be attributed to bad cables or the power
supply, as it happens on each channel, is the same for each disk, and
happens at the same time.

Does anybody have an idea what it could be and whether there is a solution?

From what I can see, that debug output from sata_nv means that the
drive hasn't reported that it has completed the command (no SDB FIS) by
the time the timeout expires (usually 30 seconds). That's an awfully long
time. It could be that those drives have issues with NCQ when the write
cache is disabled, where some of the commands in the queue can be starved
for overly long periods.
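
As a rough cross-check of the timeout theory, the spacing of the EH
entries in the report can be computed from the timestamps. This is only a
sketch: the helper name eh_intervals is made up for illustration, and the
log lines are the ones quoted above with both write caches disabled. If
each entry is a command timing out after about 30 seconds and then being
retried after the reset, consecutive entries on a port should be a little
over 30 seconds apart.

```python
import re

# EH entries copied from the report (write cache disabled on both disks).
LOG = """\
[   45.595788] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   45.595800] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   76.491877] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[   76.511331] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  107.391075] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  107.423701] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  138.287465] ata4: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
[  138.332093] ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
"""

# Timestamp, then the port name, then the sata_nv EH banner.
# The opening bracket is optional so damaged copies of the log still parse.
LINE_RE = re.compile(r"\[?\s*(\d+\.\d+)\]\s+(ata\d+): EH in SWNCQ mode")


def eh_intervals(log):
    """Return {port: [seconds between consecutive EH entries on that port]}."""
    stamps = {}
    for line in log.splitlines():
        match = LINE_RE.search(line)
        if match:
            stamps.setdefault(match.group(2), []).append(float(match.group(1)))
    return {
        port: [round(later - earlier, 3) for earlier, later in zip(ts, ts[1:])]
        for port, ts in stamps.items()
    }


if __name__ == "__main__":
    for port, gaps in sorted(eh_intervals(LOG).items()):
        print(port, gaps)
```

On both ports the gaps come out at roughly 30.9 seconds, which fits a
30 second command timeout plus the time taken by the hard reset, but it
does not by itself say why the drive never sends the SDB FIS.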

CCing linux-ide.
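
For anyone on the list who wants to see which command is getting stuck,
the "cmd" line from the quoted log can be decoded with a few lines of
Python. This is a sketch under assumptions: the field order used here
(command/feature:nsect:lbal:lbam:lbah/hob fields in the same order/device)
is my reading of how libata prints the taskfile, the sector size is
assumed to be 512 bytes, and decode_ncq_taskfile is a hypothetical helper
name.

```python
# NCQ opcodes from the ATA command set.
NCQ_COMMANDS = {
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
}


def decode_ncq_taskfile(cmd):
    """Decode a libata 'cmd' taskfile string, assuming an NCQ command.

    Assumed layout (hex bytes):
      command / feature:nsect:lbal:lbam:lbah /
      hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah / device
    """
    groups = cmd.split("/")
    command = int(groups[0], 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in groups[1].split(":"))
    hob = [int(x, 16) for x in groups[2].split(":")]
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = hob
    device = int(groups[3], 16)

    # For NCQ the sector count travels in the feature registers and the
    # tag sits in bits 7:3 of the sector count register.
    sectors = (hob_feature << 8) | feature
    tag = nsect >> 3
    # 48-bit LBA, most significant byte first.
    lba = 0
    for byte in (hob_lbah, hob_lbam, hob_lbal, lbah, lbam, lbal):
        lba = (lba << 8) | byte

    return {
        "name": NCQ_COMMANDS.get(command, "unknown (0x%02x)" % command),
        "sectors": sectors,
        "bytes": sectors * 512,  # assumes 512-byte logical sectors
        "lba": lba,
        "tag": tag,
        "device": device,
    }


if __name__ == "__main__":
    info = decode_ncq_taskfile("61/08:00:3f:f4:e8/00:00:02:00:00/40")
    for key, value in info.items():
        print(key, value)
```

Decoded this way, the stuck command is a single 8-sector WRITE FPDMA
QUEUED with tag 0, which agrees with the "tag 0 ncq 4096 out" that the
kernel prints on the same line.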