On 2012-10-02 5:02 PM, Sven Eckelmann wrote:
> On Tuesday 02 October 2012 07:06:03 Adrian Chadd wrote:
>> Hm, there are still issues on Hornet?
>
> Yes, we still have problems with hornet. The issue I am trying to "fix" with
> this patch is an interrupt storm on AR9330 devices with sta interface(s).
> Random devices crash after getting a stacktrace reporting __report_bad_irq.
> The crash either results in a reboot or hang of the device.
>
> [ 952.950000] irq 2: nobody cared (try booting with the "irqpoll" option)
> [ 952.950000] Call Trace:
> [ 952.950000] [<8026ade8>] dump_stack+0x8/0x34
> [ 952.950000] [<800a75d0>] __report_bad_irq+0x44/0xf4
> [ 952.950000] [<800a78ec>] note_interrupt+0x200/0x2a4
> [ 952.950000] [<800a58c8>] handle_irq_event_percpu+0x19c/0x1e0
> [ 952.950000] [<800a86cc>] handle_percpu_irq+0x54/0x88
> [ 952.950000] [<800a501c>] generic_handle_irq+0x3c/0x4c
> [ 952.950000] [<80064748>] do_IRQ+0x1c/0x34
> [ 952.950000] [<80062d6c>] ret_from_irq+0x0/0x4
> [ 952.950000] [<8007673c>] tasklet_action+0xb8/0xd4
> [ 952.950000] [<80076c24>] __do_softirq+0xa0/0x154
> [ 952.950000] [<80076e30>] do_softirq+0x48/0x68
> [ 952.950000] [<80076f94>] local_bh_enable+0x94/0xb0
> [ 952.950000] [<83406d60>] cfg80211_scan_done+0x670/0x6d0 [cfg80211]
> [ 952.950000]
> [ 952.950000] handlers:
> [ 952.950000] [<83564d48>] ath_isr
> [ 952.950000] Disabling IRQ #2
>
> The test setup is using 30 AR9330 devices running OpenWRT 32727/33559. 32727
> is using compat-wireless-2012-04-17 (+ many OpenWRT patches) and 33559 is
> running compat-wireless-2012-09-07 (+many more patches from Felix). 1 device
> is running an open AP device (standard OpenWRT settings) and 29 devices are
> trying to connect. Random devices will now fail. To debug this problem, I used
> one device with 8 vif devices and restarted the network script again and
> again to force the recreation of the vif and reconnect.
>
> The stack trace doesn't seem to be very helpful. Therefore, I checked ath_isr
> and noticed that the interrupts right before the device crash get the status 0
> from ar9003_hw_get_isr. Digging a little bit further also revealed that the
> interrupts in the interrupt storm also have async_cause 0 and sync_cause 0x20.
>
> This sync cause 0x20 isn't handled anywhere and may be the cause of the
> hang/crash. At least this is the symptom which can be fixed without crashing
> the system.

I checked the AR933x datasheet, and it says that cause 0x20 is tx descriptor
corruption.

- Felix
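
For anyone who wants to check their own debug logs for the same pattern, here
is a rough, standalone sketch of the signature described above: ISR status 0,
async_cause 0, and bit 0x20 set in sync_cause. This is not driver code. The
struct, the function names and the SYNC_CAUSE_TXDESC_CORRUPT macro are all
made up for illustration; only the three values and the meaning of 0x20
(tx descriptor corruption, per the datasheet remark above) come from this
thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for sync cause bit 0x20, which this thread describes
 * as "tx descriptor corruption". Not taken from the ath9k headers. */
#define SYNC_CAUSE_TXDESC_CORRUPT 0x20u

/* Hypothetical record of the three values logged for one interrupt. */
struct irq_snapshot {
    uint32_t isr_status;   /* status reported for this interrupt */
    uint32_t async_cause;  /* async cause value */
    uint32_t sync_cause;   /* sync cause value */
};

/* True if the snapshot matches the storm signature from the report:
 * status 0, async cause 0, and bit 0x20 set in the sync cause. */
bool is_storm_signature(const struct irq_snapshot *s)
{
    return s->isr_status == 0 &&
           s->async_cause == 0 &&
           (s->sync_cause & SYNC_CAUSE_TXDESC_CORRUPT) != 0;
}

/* Number of consecutive matching snapshots at the end of a log, i.e. how
 * long the storm had been running when the log stopped. */
size_t trailing_storm_len(const struct irq_snapshot *log, size_t n)
{
    size_t run = 0;

    while (run < n && is_storm_signature(&log[n - 1 - run]))
        run++;
    return run;
}
```

Feeding it the per-interrupt values collected in ath_isr before the
"nobody cared" message would show whether a given crash follows the same
sequence. It only detects the pattern and says nothing about how the driver
should recover from it.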