On 10/07/2010 11:45 AM, Ben Greear wrote:
On 10/07/2010 11:42 AM, Luis R. Rodriguez wrote:
On Thu, Oct 7, 2010 at 11:39 AM, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
On 10/07/2010 11:29 AM, Luis R. Rodriguez wrote:
On Thu, Oct 7, 2010 at 11:14 AM, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote:
On Thu, 2010-10-07 at 10:33 -0700, Ben Greear wrote:
In case it helps, here is a dump of where the corrupted SKB was
deleted.
I wonder, do you have a machine with a decent IOMMU? Adding IOMMU
debugging into the mix could help you figure out if it's a DMA
problem.
Ben, how much traffic are you RX'ing on these virtual interfaces?
I disabled my user-space application, and this script alone can
reproduce the problem fairly quickly on my system. You will need to
change some of those first variables. Just start it and wait a few
minutes and watch the splats show on the console :)
Note that I am not generating any traffic, but the wpa_supplicants are
doing their thing of course...
I'm using the kernel found here:
http://dmz2.candelatech.com/git/gitweb.cgi?p=linux.wireless-testing.ct/.git;a=summary
It's latest wireless-testing with some of my own patches, and some
I've gathered from here and there. I doubt I'm causing this problem,
but if you can't reproduce it with this script on your kernels,
I can try with base wireless-testing or whatever you are using.
I'll run this now, but can you try a vanilla wireless-testing? I hear
the latest wireless-testing is borked so maybe try (git reset --hard
master-2010-09-29), it's what I'm on.
You are liable to hit a bunch of those crashes I've been reporting
before you hit the DMA thing if you don't use latest (with Johannes' scan
locking patch).
I'm going to poke at IOMMU debugging and see what I find.
I'll start a compile of vanilla wireless-testing + scan fix as well.
Well, vanilla + scan patch locked up pretty hard when I started the script.
I was able to get sysrq to dump the locks, but the dump didn't seem to
complete, and I couldn't get any more sysrq info out of it after that.
Might be something entirely different, of course, and no idea if
this lock dump shows any real problem.
Oct 7 12:08:43 localhost kernel: SysRq : Show Locks Held
Oct 7 12:08:43 localhost kernel:
Oct 7 12:08:43 localhost kernel: Showing all locks held in the system:
Oct 7 12:08:43 localhost kernel: 3 locks held by kworker/0:0/4:
Oct 7 12:08:43 localhost kernel: #0: (events){+.+.+.}, at: [<c0443cea>] process_one_work+0x145/0x295
Oct 7 12:08:43 localhost kernel: #1: ((linkwatch_work).work){+.+.+.}, at: [<c0443cea>] process_one_work+0x145/0x295
Oct 7 12:08:43 localhost kernel: #2: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 3 locks held by kworker/u:1/38:
Oct 7 12:08:43 localhost kernel: #0: ((wiphy_name(local->hw.wiphy))){+.+.+.}, at: [<c0443cea>] process_one_work+0x145/0x295
Oct 7 12:08:43 localhost kernel: #1: ((&sta->ampdu_mlme.work)){+.+...}, at: [<c0443cea>] process_one_work+0x145/0x295
Oct 7 12:08:43 localhost kernel: #2: (&sta->ampdu_mlme.mtx){+.+...}, at: [<f8887180>] ieee80211_ba_session_work+0x52/0xbe [mac80211]
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1584:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1586:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1589:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1593:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1596:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by mingetty/1598:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by bash/1683:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by bash/1728:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by bash/1752:
Oct 7 12:08:43 localhost kernel: #0: (&tty->atomic_read_lock){+.+...}, at: [<c05f5e91>] n_tty_read+0x1d1/0x5ed
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2840:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2842:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2844:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2846:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2848:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2850:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2852:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2854:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2856:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
OcSysRq : Show State
Process wpa_supplicant (pid: 2904, ti=f3946000 task=f4124a60 task.ti=f3946000)
Stack:
Call Trace:
Code: 73 18 8b 75 08 01 73 14 88 4c 02 1c 5b 5e 5d c3 55 89 e5 57 56 53 8b 5d 08 6b 72 04 3c ff 84 30 70 11 00 00 8b 79 10 6b 72 04 3c <8b> 7f 50 01 bc 30
SysRq : Show Locks Held
Process wpa_supplicant (pid: 2904, ti=f3946000 task=f4124a60 task.ti=f3946000)
Stack:
Call Trace:
Code: 73 18 8b 75 08 01 73 14 88 4c 02 1c 5b 5e 5d c3 55 89 e5 57 56 53 8b 5d 08 6b 72 04 3c ff 84 30 70 11 00 00 8b 79 10 6b 72 04 3c <8b> 7f 50 01 bc 30
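
To make the lock dump above easier to scan, here is a minimal Python sketch
that groups the "locks held by" entries by lock name, so it is easy to see
how many tasks are listed under rtnl_mutex. It is only an illustration: the
function name tally_lock_holders and the regular expressions are my own
assumptions about the log layout shown above, not part of any kernel tool.

```python
import re
from collections import defaultdict

# Matches holder lines such as "1 lock held by wpa_supplicant/2840:"
HOLDER_RE = re.compile(r"(\d+) locks? held by (\S+)/(\d+):")
# Matches lock lines such as "#0: (rtnl_mutex){+.+.+.}, at: ..."
LOCK_RE = re.compile(r"#\d+:\s+\((.+)\)\{[^}]*\}")

# A few lines copied from the dump above, used as sample input.
SAMPLE = """\
Oct 7 12:08:43 localhost kernel: 3 locks held by kworker/0:0/4:
Oct 7 12:08:43 localhost kernel: #0: (events){+.+.+.}, at: [<c0443cea>] process_one_work+0x145/0x295
Oct 7 12:08:43 localhost kernel: #2: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
Oct 7 12:08:43 localhost kernel: 1 lock held by wpa_supplicant/2840:
Oct 7 12:08:43 localhost kernel: #0: (rtnl_mutex){+.+.+.}, at: [<c06da190>] rtnl_lock+0xf/0x11
"""


def tally_lock_holders(log_text):
    """Map each lock name to the 'task/pid' entries listed under it."""
    holders = defaultdict(list)
    current = None  # task/pid of the most recent "locks held by" line
    for line in log_text.splitlines():
        match = HOLDER_RE.search(line)
        if match:
            current = f"{match.group(2)}/{match.group(3)}"
            continue
        match = LOCK_RE.search(line)
        if match and current is not None:
            holders[match.group(1)].append(current)
    return dict(holders)


if __name__ == "__main__":
    for lock, tasks in tally_lock_holders(SAMPLE).items():
        print(f"{lock}: {len(tasks)} task(s): {', '.join(tasks)}")
```

Fed the full dump, this should list kworker/0:0/4 and the nine
wpa_supplicant processes together under rtnl_mutex, which is the part of
the dump that looks most worth a closer look.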
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com