Re: SATA driver sata_sil24

Robert Hancock <hancockrwd@xxxxxxxxx> · Thu, 29 Apr 2010 08:43:46 -0600



On Thu, Apr 29, 2010 at 3:14 AM, Richard Mawson
<richard@xxxxxxxxxxxxxxxxxxx> wrote:
> Tim,
>
> On Mon, Apr 26, 2010 at 12:59:05PM +0100, Tim Small wrote:
>> If you want to try to debug this further - you could turn on PCI parity
>> error detection (either using EDAC module, or via userspace with
>> lspci/setpci)?
>>
>> # modprobe  edac_core
>> # echo 1 > /sys/module/edac_core/parameters/check_pci_errors
>>
>> If you're after a different solution for that machine, you can buy SiI
>> 3124 based cards (PCI-X to 4x SATA) for about the same price as that
>> adaptor....
>>
>> http://www.siliconimage.com/products/product.aspx?pid=27
>
> Thanks for your suggestions.
>
> I'm not too familiar with debugging pci errors, but I'm willing to try things
> out if there are suggestions as to what to look for.
>
> Having moved this to another system, still using the pci-pcie bridge, there
> are problems too -- it just takes longer to show up. The system locks up when
> copying large quantities of data to the disks.
>
> The symptom is the following code in the interrupt handler being called many
> many times:
>
>        if (status == 0xffffffff) {
>                printk(KERN_ERR DRV_NAME ": IRQ status == 0xffffffff, "
>                       "PCI fault or device removal?\n");
>
> Does this indicate a hardware error? Is there a safe way to reset the device
> in this state to avoid the repeated calls to the interrupt handler that I
> suspect are the cause of the machine being unresponsive?
>
> I'm looking into pci debugging techniques, but any pointers would be welcome.

Register reads returning all 1s indicate that PCI aborts are likely
happening: either the bridge or the chip itself could have stopped
responding.
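
One way to look for aborts from userspace is to read the PCI Status
register of the bridge and of the controller and see which error bits are
set. The following is only a rough sketch: the function name and the flag
labels are made up for illustration, and the 02:00.0 address in the comment
is a placeholder for whatever lspci reports on your system. The bit
positions are the standard ones for the 16-bit Status register at config
offset 0x06.

```shell
#!/bin/sh
# Decode the error bits of a 16-bit PCI Status register value.
# decode_pci_status is a hypothetical helper name, not part of pciutils.
decode_pci_status() {
    status=$(( $1 ))    # accepts hex input such as 0x2000
    flags=""
    # bit:label pairs; the labels are illustrative names for the standard bits
    for entry in 15:DetectedParityError 14:SignaledSystemError \
                 13:ReceivedMasterAbort 12:ReceivedTargetAbort \
                 11:SignaledTargetAbort 8:MasterDataParityError; do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            flags="$flags $name"
        fi
    done
    flags=${flags# }            # drop the leading space
    echo "${flags:-none}"       # print "none" when no error bit is set
}

# Example with a made-up value that has only bit 13 set:
decode_pci_status 0x2000

# On a real system (as root), something like the following should work;
# 02:00.0 is a placeholder device address:
#   decode_pci_status "0x$(setpci -s 02:00.0 STATUS)"
```

These bits stay set until they are cleared, so a master or target abort
flag on either device after a lockup would point at where the reads are
failing. lspci -vv shows the same information as flags such as <MAbort+
and <TAbort+ on the Status line (and on the Secondary status line for a
bridge).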