On Sun, Sep 02, 2012 at 11:24:47PM -0700, Gregoire Gentil wrote:
> Hello,
>
> I'm trying to debug a wifi bug with 3.4-rt17 applied, running on an
> OMAP4 ARM board such as Pandaboard.
>
> Wi-Fi works perfectly well without rt patches. It also works quite
> well with rt patches AND without wifi module loaded. But with both
> rt patches and wifi module, the system is very flaky and even if I
> manage to launch a big download, I get a kernel hang. I managed to
> get a trace:
>
> BUG: scheduling while atomic: irq/213-wl12xx/1588/0x00010002
> Modules linked in: omapdce(C) wl12xx wlcore omaprpc(C) mac80211 d
> [<c001beb4>] (unwind_backtrace+0x0/0xf0) from [<c0613548>] (dump)
> [<c0613548>] (dump_stack+0x20/0x24) from [<c0073908>] (__schedul)
> [<c0073908>] (__schedule_bug+0x54/0x60) from [<c0614818>] (__sch)
> [<c0614818>] (__schedule+0x74/0x6c0) from [<c0614f60>] (schedule)
> [<c0614f60>] (schedule+0xa0/0xb8) from [<c0615eb8>] (rt_spin_loc)
> [<c0615eb8>] (rt_spin_lock_slowlock+0x198/0x288) from [<c06160a8)
> [<c06160a8>] (rt_spin_lock+0x18/0x1c) from [<bf0c6b24>] (wl12xx_)
> [<bf0c6b24>] (wl12xx_hardirq+0x2c/0xa4 [wlcore]) from [<c00bd4f0)
> [<c00bd4f0>] (handle_irq_event_percpu+0xac/0x24c) from [<c00bd70)
> [<c00bd70c>] (handle_irq_event+0x7c/0x9c) from [<c00c08f0>] (han)
> [<c00c08f0>] (handle_level_irq+0xe4/0x134) from [<c00bcf58>] (ge)
> [<c00bcf58>] (generic_handle_irq+0x34/0x3c) from [<c03217e8>] (g)
> [<c03217e8>] (gpio_irq_handler+0x160/0x1a4) from [<c00bcf58>] (g)
> [<c00bcf58>] (generic_handle_irq+0x34/0x3c) from [<c001449c>] (h)
> [<c001449c>] (handle_IRQ+0x88/0xc8)
>
> Source code including the function wl12xx_hardirq is here:
> http://dev.omapzoom.org/?p=integration/kernel-ubuntu.git;a=blob;f=drivers/net/wireless/ti/wlcore/main.c;h=45fe911a6504f92dddff5a9415bb77a643b3c4a9;hb=f84c72f6b36418ff11d16808c16a7c3216730bb0
>
> Any idea what could be wrong and how I could debug and fix this situation?
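
At first glance, it looks like the driver uses request_threaded_irq() to
register its handlers, but tries to acquire a regular spin_lock in its
primary handler. That's bad news: with CONFIG_PREEMPT_RT, a spin_lock is
a sleeping lock and can schedule() when contended, while your trace shows
wl12xx_hardirq being called straight from handle_irq_event_percpu(), i.e.
in hard interrupt context, where scheduling is not allowed. That is the
rt_spin_lock -> rt_spin_lock_slowlock -> schedule path that ends in
"scheduling while atomic".

And it's not just that, unfortunately, since the primary handler also
complete()s a completion, which can also schedule().

It looks like the overall interrupt handling strategy of this driver
probably needs to be revisited. :(

-- joshc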
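
P.S. To make the failure mode concrete, here is a small userspace model of it. This is only a sketch: it is not kernel code and not the wlcore driver. Every model_* name, plus bad_primary and good_primary, is hypothetical and made up for illustration. It assumes a single lock that sleeps when contended (as a spin_lock does with CONFIG_PREEMPT_RT) and counts how often a handler would have to sleep while still in hard interrupt context.

```c
/* Userspace model only; every name here is hypothetical, not a kernel API. */
#include <stdbool.h>

enum model_irqreturn { MODEL_IRQ_HANDLED, MODEL_IRQ_WAKE_THREAD };

struct model_dev {
    bool lock_held;   /* stands in for a spinlock_t, which sleeps on RT */
    bool in_hardirq;  /* true while the primary handler runs */
    int  atomic_bugs; /* modelled "scheduling while atomic" reports */
    int  work_done;   /* work items completed by the threaded handler */
};

/* Taking a contended sleeping lock means calling schedule(). */
static void model_lock(struct model_dev *d)
{
    if (d->lock_held && d->in_hardirq)
        d->atomic_bugs++;   /* would sleep in hard interrupt context */
    d->lock_held = true;    /* model: the previous owner eventually lets go */
}

static void model_unlock(struct model_dev *d)
{
    d->lock_held = false;
}

/* Problem pattern: the primary handler takes the sleeping lock. */
static enum model_irqreturn bad_primary(struct model_dev *d)
{
    model_lock(d);
    model_unlock(d);
    return MODEL_IRQ_WAKE_THREAD;
}

/* Alternative: the primary handler only asks for the thread to run. */
static enum model_irqreturn good_primary(struct model_dev *d)
{
    (void)d;
    return MODEL_IRQ_WAKE_THREAD;
}

/* Threaded handler: sleeping is allowed here, so taking the lock is fine. */
static void model_thread_fn(struct model_dev *d)
{
    model_lock(d);
    d->work_done++;
    model_unlock(d);
}

/* Deliver one interrupt; "contended" means another context holds the lock.
 * Returns the number of modelled "scheduling while atomic" reports. */
static int model_deliver(bool use_bad_primary, bool contended)
{
    struct model_dev d = { .lock_held = contended };
    enum model_irqreturn ret;

    d.in_hardirq = true;
    ret = use_bad_primary ? bad_primary(&d) : good_primary(&d);
    d.in_hardirq = false;

    if (ret == MODEL_IRQ_WAKE_THREAD)
        model_thread_fn(&d);
    return d.atomic_bugs;
}
```

In this model only the combination of a lock-taking primary handler and a contended lock reports a problem, which would match the bug showing up under load rather than on every interrupt. In a real driver the corresponding options would be to move the locked work into the threaded handler, or to use a raw_spinlock_t for state that really must be touched in the primary handler; which one fits wlcore is something I have not checked.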