Re: ath9k: irq storm after suspend/resume

On Tue, Aug 30, 2011 at 12:11 PM, Clemens Buchacher <drizzd@xxxxxx> wrote:
> Hi Mohammed,
>
> On Mon, Aug 29, 2011 at 08:42:33PM +0530, Mohammed Shafi wrote:
>>
>> >> But still, the interrupts come. Note that according to
>> >> /proc/interrupts, the IRQ line is not shared with any other device.
>> >> I did not manage to determine which interrupt it is exactly,
>> >> because the device is not in a ready state (SC_OP_INVALID is set)
>> >> when they happen (in either scenario that triggers the IRQ storm).
>> >> And SC_OP_INVALID is cleared only much later in ath9k_start.
>> >>
>> >> So, I am at a loss. Any ideas?
>> >
>> > please provide the lspci -vvvxx.
>
> Please see below.

thanks!

>
>> >> also looking at
>> >> /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt.
>
> Those interrupt counters are always zero, because ath_isr never
> gets to the point where it would gather statistics. The interrupt
> routine exits right at the start, because SC_OP_INVALID is still
> set.

Yes, that's right. It is not a good idea in general, but we might learn
something by not setting the SC_OP_INVALID flag in ath_pci_probe() (it was
added to fix a panic, but leaving it out did not cause a panic for me just now).

>
>        if (sc->sc_flags & SC_OP_INVALID)
>                return IRQ_NONE;
>
> By the time the invalid flag is cleared, the IRQ line has long
> since been disabled, due to 10000 spurious interrupts during less
> than 500 ms.
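
The sequence described above can be sketched in user space. This is only a
model under stated assumptions, not driver or kernel code: sim_isr(),
sim_deliver() and both structs are hypothetical names, and the limit is just
a parameter (the 10000 figure is taken from the message above, not from the
kernel source). It shows why the debugfs counters stay at zero and why
clearing the flag later no longer helps once the line has been disabled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_result { IRQ_NONE_SIM, IRQ_HANDLED_SIM };

struct sim_softc {
	bool invalid;        /* models SC_OP_INVALID being set */
	unsigned long stats; /* models the debugfs interrupt counters */
};

struct sim_line {
	unsigned long unhandled; /* interrupts nobody claimed */
	bool disabled;           /* line shut off by the IRQ core */
};

/* Mirrors the early exit quoted above: nothing is counted while invalid. */
static enum irq_result sim_isr(struct sim_softc *sc)
{
	if (sc->invalid)
		return IRQ_NONE_SIM;
	sc->stats++;
	return IRQ_HANDLED_SIM;
}

/*
 * Try to deliver n interrupts. Once `limit` of them have gone unhandled,
 * the line is disabled and nothing further reaches the handler.
 * Returns the number of interrupts actually delivered.
 */
static unsigned long sim_deliver(struct sim_line *line, struct sim_softc *sc,
				 unsigned long n, unsigned long limit)
{
	unsigned long delivered = 0;

	while (delivered < n && !line->disabled) {
		delivered++;
		if (sim_isr(sc) == IRQ_NONE_SIM && ++line->unhandled >= limit)
			line->disabled = true;
	}
	return delivered;
}
```

With invalid set and a limit of 10000, a call asking for 50000 interrupts
stops after 10000, leaves stats at 0, and marks the line disabled. Clearing
invalid afterwards changes nothing, because no interrupt is delivered any more.
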
>
>> > Hi, I think this will help: please capture the messages printed after
>> > sudo modprobe ath9k debug=0xffffffff.
>> > A few fatal PCI interrupt messages are based on ATH_DEBUG_ANY.
>
> Whenever I did that in the past, it just added lots of PDADC debug
> messages.

Still, we might see some fatal PCI interrupt messages that way.


>
>> We can also try to disable MIB interrupts, though this is handled
>> properly now in ath9k:
>>
>> http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-09-25/0001-ath9k-disable-MIB-interrupts-to-fix-interrupt-storm.patch
>
> But I am already disabling all interrupts by setting the mask to 0.
> Unless there are some non-maskable ones?
>
> I wonder if the device is in some crashed state at this point. Is
> it possible to reset the device in ath_pci_probe?

I don't think ath_reset can be called there.

>
>> a recent commit, not sure this will help suspend/resume
>>
>> commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8
>> Author: Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxxxx>
>> Date:   Sat Aug 13 10:28:09 2011 +0530
>>
>>    ath9k: Fix rx overrun interrupt storm
>
> For the same reason as above, this patch does not touch any code
> that would get executed.
>
>> > Also, this additional information might help:
>> > have you seen this happening on 32-bit as well?
>
> I have never had a 32-bit system on this machine.
>
>> > is this happening in wireless-testing  Linux 3.1-rc3 ? or the latest
>> > compat wireless?
>
> I think I tried last week, but I can try again.
>
>> > I did some preliminary testing and was not able to reproduce it. I will
>> > try further. Thanks!
>
> Thanks for trying. Did you turn off network manager? As I described
> here, it can make the bug go away.

I am a bit confused by the bug comments; please correct me if I am wrong.
They say that disabling Network-Manager, or putting it to sleep, triggers
the problem.


>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=39112#c5
>
> Clemens
> ---
>
> 02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
>        Subsystem: AzureWave Device 1089
>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>        Latency: 0, Cache Line Size: 64 bytes
>        Interrupt: pin A routed to IRQ 17
>        Region 0: Memory at d2c00000 (64-bit, non-prefetchable) [size=64K]
>        Capabilities: [40] Power Management version 3
>                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
>                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
>                Address: 00000000  Data: 0000
>        Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00
>                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>                        MaxPayload 128 bytes, MaxReadReq 512 bytes
>                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
>                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
>                        ClockPM- Surprise- LLActRep- BwNot-
>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                         Compliance De-emphasis: -6dB
>                LnkSta2: Current De-emphasis Level: -6dB
>        Capabilities: [100 v1] Advanced Error Reporting
>                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>        Capabilities: [140 v1] Virtual Channel
>                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
>                Arb:    Fixed- WRR32- WRR64- WRR128-
>                Ctrl:   ArbSelect=Fixed
>                Status: InProgress-
>                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
>                        Status: NegoPending- InProgress-
>        Capabilities: [160 v1] Device Serial Number 00-15-17-ff-ff-24-14-12
>        Capabilities: [170 v1] Power Budgeting <?>
>        Kernel driver in use: ath9k
>        Kernel modules: ath9k
> 00: 8c 16 2b 00 07 00 10 00 01 00 80 02 10 00 00 00
> 10: 04 00 c0 d2 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 3b 1a 89 10
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 03 01 00 00
>
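
For reference, the interrupt-related fields in the hex dump above can be
decoded by hand. The sketch below does this in C, assuming the standard PCI
type 0 header layout (command at 0x04, status at 0x06, interrupt pin at 0x3d,
INTx disable as command bit 10, INTx status as status bit 3). The CFG_* names
and helper functions are made up for this example. Note that it describes the
device only at the time lspci was run, not during the storm.

```c
#include <stdbool.h>
#include <stdint.h>

/* First 64 bytes of config space, copied from the lspci -xx dump above. */
static const uint8_t dump[64] = {
	0x8c, 0x16, 0x2b, 0x00, 0x07, 0x00, 0x10, 0x00,
	0x01, 0x00, 0x80, 0x02, 0x10, 0x00, 0x00, 0x00,
	0x04, 0x00, 0xc0, 0xd2, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x3b, 0x1a, 0x89, 0x10,
	0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00,
};

/* Illustrative names for standard type 0 header offsets and bits. */
#define CFG_VENDOR_ID        0x00
#define CFG_DEVICE_ID        0x02
#define CFG_COMMAND          0x04
#define CFG_STATUS           0x06
#define CFG_INTERRUPT_PIN    0x3d
#define CFG_CMD_INTX_DISABLE 0x0400 /* command bit 10 */
#define CFG_STS_INTX_PENDING 0x0008 /* status bit 3 */

/* Config space is little-endian: low byte first. */
static uint16_t cfg_read16(const uint8_t *c, unsigned int off)
{
	return (uint16_t)(c[off] | (c[off + 1] << 8));
}

/* True if legacy INTx assertion is disabled in the command register. */
static bool intx_disabled(const uint8_t *c)
{
	return (cfg_read16(c, CFG_COMMAND) & CFG_CMD_INTX_DISABLE) != 0;
}

/* True if the device reports a pending INTx in the status register. */
static bool intx_pending(const uint8_t *c)
{
	return (cfg_read16(c, CFG_STATUS) & CFG_STS_INTX_PENDING) != 0;
}
```

For this dump, intx_disabled(dump) and intx_pending(dump) are both false,
which matches the "DisINTx-" and "INTx-" flags that lspci printed. Reading
the same two bits while the storm is in progress might show whether the
device itself is asserting the line, but that is only a suggestion and has
not been tried here.
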



-- 
shafi