Re: Libata EH False Alarm (2.6.18-rc2)?

"Fajun Chen" <fajunchen@xxxxxxxxx> · Fri, 25 Aug 2006 08:41:24 -0600

Hi Tejun,

I tested your patch.  An overnight test produced no more EH alarms in
dmesg.  One question I have is whether this fix will mask genuine
failures.

Thanks,
Fajun

On 8/24/06, Tejun Heo <htejun@xxxxxxxxx> wrote:
Tejun Heo wrote:
> Fajun Chen wrote:
>> Sil3124. That's the only chipset we use.
>
>>> > [30540.003174] ata1: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0x2 frozen
>>> > [30540.003259] ata1: (irq_stat 0x01100010, PHY RDY changed)
>
> Yep, this message is from sata_sil24.  You're not getting any phy status
> change bits in SError although the device is reporting a phy rdy changed
> event.  However, your 3124 is reporting an 8b/10b decoding error
> threshold exceeded interrupt.  That could be related to the phyrdy
> status changed event.  This happens only under heavy IO, right?  How
> often does it occur, in occurrences per megabyte transferred?
>
> An 8b/10b error is a recoverable FIS reception error.  The interrupt bit
> (bit 24 of irq_stat) is only turned on if the threshold count is
> exceeded; the threshold is currently initialized to 0x8000.  This
> indicates that there are quite a few transmission failures.
>
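
As a hedged aside, the two register values in the log above can be decoded
with a small C sketch.  The 8b/10b interrupt position (bit 24 of irq_stat)
comes from the explanation above.  Everything else is an assumption made for
illustration: the SError bit positions follow the usual SATA SError layout
(bit 16 PhyRdy change, bit 19 10b-to-8b decode error), the PHY RDY changed
interrupt is assumed to be bit 4 with raw copies shifted up by 16, and the
macro and function names are hypothetical, not taken from sata_sil24.c.
Check them against the driver source before relying on them.

```c
#include <stdint.h>

/* Assumed interrupt bit layout (illustrative names, not the driver's). */
#define IRQ_PHYRDY_CHG   (1u << 4)   /* assumption: PHY RDY changed */
#define IRQ_8B10B        (1u << 8)   /* raw bit 8 + shift 16 = bit 24 of irq_stat */
#define IRQ_RAW_SHIFT    16          /* assumption: raw status lives in the upper half */

/* Assumed SError diagnostic bits, per the common SATA SError layout. */
#define SERR_PHYRDY_CHG  (1u << 16)  /* PhyRdy change */
#define SERR_10B_8B_ERR  (1u << 19)  /* 10b to 8b decode error */

/* Returns 1 if irq_stat reports a PHY RDY changed event. */
int irq_phyrdy_changed(uint32_t irq_stat)
{
	return (irq_stat & IRQ_PHYRDY_CHG) != 0;
}

/* Returns 1 if the raw 8b/10b threshold-exceeded bit (bit 24) is set. */
int irq_raw_8b10b(uint32_t irq_stat)
{
	return ((irq_stat >> IRQ_RAW_SHIFT) & IRQ_8B10B) != 0;
}

/* Returns 1 if SError records a PhyRdy change. */
int serr_phyrdy_changed(uint32_t serr)
{
	return (serr & SERR_PHYRDY_CHG) != 0;
}

/* Returns 1 if SError records a 10b-to-8b decode error. */
int serr_decode_error(uint32_t serr)
{
	return (serr & SERR_10B_8B_ERR) != 0;
}
```

Under these assumptions, irq_stat 0x01100010 has both the PHY RDY changed
bit and the 8b/10b threshold bit set, while SErr 0x80000 carries only the
decode error bit and no PhyRdy change bit, which matches the reading given
above.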

Sorry, I forgot to attach the patch.  Can you please try the attached patch?

--
tejun

--- a/drivers/scsi/sata_sil24.c
+++ b/drivers/scsi/sata_sil24.c
@@ -1034,9 +1034,9 @@ static void sil24_init_controller(struct
                        writel(PORT_CS_IRQ_WOC, port + PORT_CTRL_CLR);

                /* Zero error counters. */
-               writel(0x8000, port + PORT_DECODE_ERR_THRESH);
-               writel(0x8000, port + PORT_CRC_ERR_THRESH);
-               writel(0x8000, port + PORT_HSHK_ERR_THRESH);
+               writel(0x0000, port + PORT_DECODE_ERR_THRESH);
+               writel(0x0000, port + PORT_CRC_ERR_THRESH);
+               writel(0x0000, port + PORT_HSHK_ERR_THRESH);
                writel(0x0000, port + PORT_DECODE_ERR_CNT);
                writel(0x0000, port + PORT_CRC_ERR_CNT);
                writel(0x0000, port + PORT_HSHK_ERR_CNT);


