On Tue, Aug 30, 2011 at 12:11 PM, Clemens Buchacher <drizzd@xxxxxx> wrote:
> Hi Mohammed,
>
> On Mon, Aug 29, 2011 at 08:42:33PM +0530, Mohammed Shafi wrote:
>>
>> >> But still, the interrupts come. Note that according to
>> >> /proc/interrupts, the IRQ line is not shared with any other device.
>> >> I did not manage to determine which interrupt it is exactly,
>> >> because the device is not in a ready state (SC_OP_INVALID is set)
>> >> when they happen (in either scenario that triggers the IRQ storm).
>> >> And SC_OP_INVALID is cleared only much later in ath9k_start.
>> >>
>> >> So, I am at a loss. Any ideas?
>> >
>> > please provide the lspci -vvvxx.
>
> Please see below.

thanks!

>
>> >> also looking at
>> >> /sys/kernel/debug/ieee80211/phy0/ath9k$ sudo cat interrupt.
>
> Those interrupt counters are always zero, because ath_isr never
> gets to the point where it would gather statistics. The interrupt
> routine exits right at the start, because SC_OP_INVALID is still
> set.

yes it is. though it is not a good idea, i was just thinking we could
learn something by not setting the SC_OP_INVALID flag in ath_pci_probe
(it was added to fix a panic, but it did not cause a panic for me now).

>
>        if (sc->sc_flags & SC_OP_INVALID)
>                return IRQ_NONE;
>
> By the time the invalid flag is cleared, the IRQ line has long
> since been disabled, due to 10000 spurious interrupts during less
> than 500 ms.
>
>> > hi, i think this will help, please get the messages from sudo modprobe ath9k
>> > debug=0xffffffff.
>> > a few fatal PCI interrupt messages are based on ATH_DEBUG_ANY.
>
> Whenever I did that in the past, it just added lots of PDADC debug
> messages.

though we might get some PCI fatal interrupts.

>
>> we can also try to disable MIB interrupts, though it's handled properly
>> now in ath9k
>>
>> http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-09-25/0001-ath9k-disable-MIB-interrupts-to-fix-interrupt-storm.patch
>
> But I am already disabling all interrupts by setting the mask to 0.
> Unless there are some non-maskable ones?
>
> I wonder if the device is in some crashed state at this point. Is
> it possible to reset the device in ath_pci_probe?

i don't think ath_reset can be called

>
>> a recent commit, not sure this will help suspend/resume
>>
>> commit 0682c9b52bf51fbc67c4e79fcbdadcf70bd600f8
>> Author: Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxxxx>
>> Date:   Sat Aug 13 10:28:09 2011 +0530
>>
>>     ath9k: Fix rx overrun interrupt storm
>
> For the same reason as above, this patch does not touch any code
> that would get executed.
>
>> > also this additional information might help:
>> > have you seen this happening on 32 bit also?
>
> I have never had a 32-bit system on this machine.
>
>> > is this happening in wireless-testing Linux 3.1-rc3? or the latest
>> > compat wireless?
>
> I think I tried last week, but I can try again.
>
>> > i did some preliminary testing, not able to recreate it. will try
>> > further. thanks!
>
> Thanks for trying. Did you turn off network manager? As I described
> here, it can make the bug go away.

i am a bit confused looking at the bug comments. please correct me.
the bug comments say that disabling the Network-Manager, or putting it
to sleep, triggers the problem.

>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=39112#c5
>
> Clemens
> ---
>
> 02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
>         Subsystem: AzureWave Device 1089
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 17
>         Region 0: Memory at d2c00000 (64-bit, non-prefetchable) [size=64K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
>                 Address: 00000000  Data: 0000
>         Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00
>                 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
>                         ClockPM- Surprise- LLActRep- BwNot-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>                 LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [140 v1] Virtual Channel
>                 Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
>                 Arb:    Fixed- WRR32- WRR64- WRR128-
>                 Ctrl:   ArbSelect=Fixed
>                 Status: InProgress-
>                 VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>                         Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>                         Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
>                         Status: NegoPending- InProgress-
>         Capabilities: [160 v1] Device Serial Number 00-15-17-ff-ff-24-14-12
>         Capabilities: [170 v1] Power Budgeting <?>
>         Kernel driver in use: ath9k
>         Kernel modules: ath9k
> 00: 8c 16 2b 00 07 00 10 00 01 00 80 02 10 00 00 00
> 10: 04 00 c0 d2 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 3b 1a 89 10
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 03 01 00 00

--
shafi
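
For anyone reading the hex rows of the lspci dump above by hand, here is a
minimal sketch in plain C (user space, not driver code) that decodes the
interrupt-related fields from those 64 bytes. The register offsets and bit
masks come from the standard PCI configuration header, and the macro names
mirror the ones in the kernel's pci_regs.h. The array name and the helper
functions (cfg_read16, intx_disabled, intx_pending, print_intx_summary) are
made up for this illustration and are not part of ath9k.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Offsets and bits of the standard PCI configuration header.
 * Names mirror <linux/pci_regs.h>; defined locally so this builds anywhere. */
#define PCI_VENDOR_ID            0x00
#define PCI_DEVICE_ID            0x02
#define PCI_COMMAND              0x04
#define PCI_COMMAND_INTX_DISABLE 0x400  /* lspci "DisINTx" */
#define PCI_STATUS               0x06
#define PCI_STATUS_INTERRUPT     0x08   /* lspci "INTx" */
#define PCI_INTERRUPT_LINE       0x3c
#define PCI_INTERRUPT_PIN        0x3d

#define CFG_DUMP_LEN 64

/* The four hex rows from the lspci dump, copied byte for byte. */
const uint8_t ar9285_cfg[CFG_DUMP_LEN] = {
    0x8c, 0x16, 0x2b, 0x00, 0x07, 0x00, 0x10, 0x00,
    0x01, 0x00, 0x80, 0x02, 0x10, 0x00, 0x00, 0x00,
    0x04, 0x00, 0xc0, 0xd2, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x3b, 0x1a, 0x89, 0x10,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00,
};

/* Config space is little-endian: low byte first. */
uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
    assert(off + 1 < CFG_DUMP_LEN);
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Nonzero if legacy INTx assertion is disabled in the command register. */
int intx_disabled(const uint8_t *cfg)
{
    return (cfg_read16(cfg, PCI_COMMAND) & PCI_COMMAND_INTX_DISABLE) != 0;
}

/* Nonzero if the status register reported a pending INTx when dumped. */
int intx_pending(const uint8_t *cfg)
{
    return (cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_INTERRUPT) != 0;
}

void print_intx_summary(const uint8_t *cfg)
{
    printf("vendor:device  %04x:%04x\n",
           (unsigned int)cfg_read16(cfg, PCI_VENDOR_ID),
           (unsigned int)cfg_read16(cfg, PCI_DEVICE_ID));
    printf("command        0x%04x (DisINTx%c)\n",
           (unsigned int)cfg_read16(cfg, PCI_COMMAND),
           intx_disabled(cfg) ? '+' : '-');
    printf("status         0x%04x (INTx%c)\n",
           (unsigned int)cfg_read16(cfg, PCI_STATUS),
           intx_pending(cfg) ? '+' : '-');
    /* Pin 1 means INTA. The line register holds whatever was written at
     * boot and need not match the IRQ number the kernel reports. */
    printf("interrupt pin  %u, line register 0x%02x\n",
           (unsigned int)cfg[PCI_INTERRUPT_PIN],
           (unsigned int)cfg[PCI_INTERRUPT_LINE]);
}
```

Calling print_intx_summary(ar9285_cfg) from a small main() should agree with
the decoded lspci text: DisINTx- and INTx-, i.e. INTx was neither disabled
nor flagged as pending in the status register at the moment the dump was
taken. That is only a snapshot and says nothing about the state during the
storm itself.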