Re: IRQ issues with multiple SiI3114's on Kernel 3.2

On Sat, Jul 28, 2012 at 1:48 PM, Stirling Westrup <swestrup@xxxxxxxxx> wrote:
> On Sat, Jul 28, 2012 at 5:10 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
>> On 7/27/2012 9:20 PM, Stirling Westrup wrote:
>>> On Fri, Jul 27, 2012 at 6:14 PM, Stirling Westrup <swestrup@xxxxxxxxx> wrote:
>>>> On Fri, Jul 27, 2012 at 1:24 PM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
>>>>> On 7/27/2012 11:40 AM, Stirling Westrup wrote:
>>>>>
>>>>>> I recently purchased a large system for use as a backup server for a
>>>>>> pair of small businesses. It contains a boot drive plus 10 more
>>>>>> storage drives. Despite having three onboard SATA controllers, the
>>>>>> motherboard didn't have enough SATA connectors for all the drives, so
>>>>>> I installed a pair of identical SiI3114 raid cards to handle the extra
>>>>>> connections. It has a Sandy Bridge chipset, so I installed a 3.2
>>>>>> kernel.
>>>>>>
>>>>>> # uname -a
>>>>>> Linux ttt 3.2.0-0.bpo.2-amd64 #1 SMP Fri Jun 29 20:42:29 UTC 2012
>>>>>> x86_64 GNU/Linux
>>>>> ...
>>>>>> Okay, enough background. Here's the issue: I had no trouble building
>>>>>> and sync'ing the first array, but when I try to sync the second array,
>>>>>> I always get the following dmesg an hour or so into the process:
>>>>>>
>>>>>> irq 19: nobody cared (try booting with the "irqpoll" option)
>>>>>> [  346.120572] Pid: 1100, comm: md1_resync Not tainted 3.2.0-0.bpo.2-amd64 #1
>>>>>> [  346.120573] Call Trace:
>>>>>> ...
>>>>>> [  346.120697] handlers:
>>>>>> [  346.120699] [<ffffffffa00479e0>] ahci_interrupt
>>>>>> [  346.120702] [<ffffffffa02f17ec>] sil_interrupt
>>>>>> [  346.120703] Disabling IRQ #19
>>>>>> [  346.122145] sched: RT throttling activated
>>>>> ...
>>>>>> From this point onward syncing drops to a tiny fraction of its
>>>>>> previous speed. I've tried booting with 'irqpoll' as the error message
>>>>>> suggests, but it has had no effect. I'm really not sure if there is a
>>>>>> conflict between my two SiI3114's or between the SiI's and the Marvell
>>>>>> controller (although I've never had an issue with Marvell in the
>>>>>> past), nor how to go about diagnosing or fixing this.  I'll include a
>>>>>> full dmesg dump below, as well as my currently loaded modules. If
>>>>>> anyone wants any further info, just ask.
>>>>>
>>>>> Have you tried irqbalance to spread the interrupts across cores/cache
>>>>> domains? https://irqbalance.org/documentation.html
>>>>>
>>>>
>>>> Thanks for the tip! I installed irqbalance and rebooted the system,
>>>> and everything has been running smoothly for the last two hours. I'll
>>>> let everyone know tomorrow if it actually finished the full 20-hour
>>>> resync without incident.
>>
>>> Alas, all it did was delay the IRQ error by a few hours. Does anyone
>>> else have any ideas about how I could tackle this?
>>
>> Try irqpoll and irqbalance together.  Also, which motherboard is this,
>> exact make/model please.  May be a BIOS issue.  Doesn't seem to be using
>> MSIs.  If the mobo and cards all support MSIs, enabling that may fix
>> this as well.
>>
>> Also what make/model are the SiI3114 cards?  PCIe or PCI?
>
> The motherboard is an Asus P8Z68-V Pro/Gen3 and has full MSI support,
> but the cards are labeled "Syba SiI 3114 PCI to 4 Port Sata 150" and
> don't support MSI.
>
>> Have you
>> tried different slot combinations?  Moving one card to a different slot
>> may get it routed to PCI INTB instead of INTA.  That may get it mapped
>> to something other than IRQ#19.  Updating the 3114 boards to their
>> latest firmware is worth a shot, if not there already.
>
> At one point the system was mapping the interrupt to IRQ#17, instead
> of 19, but it still failed.
>
> I haven't tried moving the cards to different slots or anything but,
> IIRC, it only has two PCI slots.
>
> I also have yet to try upgrading the BIOS of the mobo or updating the
> card firmware. I was hoping to have a better idea of what was going
> wrong before going down that route.
>
> I also wonder if the problem could be kernel or libata related
> (which is why I'm asking in this forum).
>

Okay, it looks like it's a known hardware chipset problem, first
reported six months ago.

It affects all PCI cards in Asus Sandy Bridge motherboards. No known
fix as of yet.

https://lkml.org/lkml/2012/1/30/216
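
For anyone who hits the same message: the "handlers:" list in the trace
quoted above shows ahci_interrupt and sil_interrupt registered on the
same line, so the AHCI controller and at least one SiI3114 share IRQ 19.
A rough way to see which drivers share a line is to look at
/proc/interrupts. The sketch below is only an illustration. The function
name shared_irqs and the SAMPLE text are made up for this example (they
are not output from the machine in this thread), and it assumes the
usual layout where drivers sharing a line appear comma-separated at the
end of the row.

```python
def shared_irqs(interrupts_text):
    """Map each numbered IRQ that has more than one handler to its handler names."""
    shared = {}
    lines = interrupts_text.splitlines()
    if not lines:
        return shared
    # The header row names one column per CPU (CPU0, CPU1, ...).
    ncpus = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip non-numeric rows such as NMI, LOC, ERR
        # Drop the per-CPU counters; what remains is the controller
        # description followed by the comma-separated handler names.
        tail = " ".join(rest.split()[ncpus:])
        names = [n.strip() for n in tail.split(",")]
        if len(names) < 2:
            continue  # zero or one handler: the line is not shared
        first = names[0].split()
        if not first:
            continue
        names[0] = first[-1]  # strip the controller description
        shared[int(label)] = names
    return shared


# Hypothetical sample in the style of a 3.x kernel, not real output.
SAMPLE = """\
           CPU0       CPU1
  0:         46          0   IO-APIC-edge      timer
 19:     200001          0   IO-APIC-fasteoi   ahci, sata_sil
 42:       1000         12   PCI-MSI-edge      eth0
NMI:          0          0   Non-maskable interrupts
"""

print(shared_irqs(SAMPLE))  # {19: ['ahci', 'sata_sil']}
```

On a real system, pass in the contents of /proc/interrupts instead of
SAMPLE, for example open("/proc/interrupts").read().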

