Hi,

On Mon, May 26, 2008 at 6:41 AM, Justin Madru <jdm64@xxxxxxxxx> wrote:
> Hi,
>
> I've been getting kernel crashes at random when a video file just starts to
> play (using VLC).
> As soon as the first frame shows, the system locks up hard (sometimes not
> even alt+sysrq+b works).
>
> Just recently, when it crashed it was able to print an oops to the syslog.
> The weird thing is that it says that it's a bug in mac80211? But I only have
> the crash the instant a video file starts to play. (I have an Intel 3945
> wireless, and Intel i945 graphic card)
>
> BUG: unable to handle kernel NULL pointer dereference at 00000090
> IP: [<f89e721f>] :mac80211:ieee80211_associate+0x24f/0x610
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT SMP
> Modules linked in: i915 acpi_cpufreq cpufreq_powersave cpufreq_stats
> cpufreq_userspace cpufreq_conservative container sbs sbshc ext3 jbd mbcache
> arc4 ecb crypto_blkcipher rtc dcdbas cryptomgr crypto_algapi psmouse evdev
> snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm iwl3945 mac80211 snd_timer
> crc32 snd_page_alloc video backlight output ac button battery intel_agp
> reiserfs sr_mod cdrom sg ata_piix ehci_hcd uhci_hcd usbcore thermal
> processor fan
>
> Pid: 1899, comm: iwl3945 Not tainted (2.6.26-rc3-git #1)
> EIP: 0060:[<f89e721f>] EFLAGS: 00010246 CPU: 1
> EIP is at ieee80211_associate+0x24f/0x610 [mac80211]
> EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: f7b85e38
> ESI: f7b85e84 EDI: ecc7122e EBP: f7bbdd34 ESP: f7bbdcc0
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process iwl3945 (pid: 1899, ti=f7bbd000 task=f718d390 task.ti=f7bbd000)
> Stack: f7b85e84 00000000 f7bbdd14 00000202 f7b85e38 f7b85800 f7f65f00
> 00000018
> f7bbdcfa 00000000 00000421 00000003 00000006 00000052 f7bbdd0c ecc7122c
> f71593a4 00000000 f7bbde15 f7bbdd3c c0295679 303a3030 33623a66 3a31613a
> Call Trace:
> [_format_mac_addr+0x79/0x90] ? _format_mac_addr+0x79/0x90
> [sched_debug_show+0x9c6/0xcb0] ? sched_debug_show+0x9c6/0xcb0
> [<f89e7610>] ? ieee80211_auth_completed+0x30/0x40 [mac80211]
> [<f89e7a73>] ? ieee80211_rx_mgmt_auth+0x303/0x4b0 [mac80211]
> [hrtimer_start+0xc2/0x150] ? hrtimer_start+0xc2/0x150
> [hrtick_set+0x85/0x100] ? hrtick_set+0x85/0x100
> [jbd:schedule+0x364/0x8c0] ? schedule+0x364/0x870
> [<f89e7da7>] ? ieee80211_sta_rx_queued_mgmt+0x187/0xcb0 [mac80211]
> [ext3:preempt_schedule+0x33/0x100] ? preempt_schedule+0x33/0x50
> [mac80211:dev_queue_xmit+0xa6/0x1f20] ? dev_queue_xmit+0xa6/0x330
> [mac80211:_spin_unlock_bh+0x18/0xb0] ? _spin_unlock_bh+0x18/0x20
> [<f89e33b7>] ? ieee80211_rx_bss_get+0xa7/0xc0 [mac80211]
> [mac80211:skb_dequeue+0x4d/0x360] ? skb_dequeue+0x4d/0x70
> [<f89e960f>] ? ieee80211_sta_work+0x8f/0x760 [mac80211]
> [hrtick_set+0xa7/0x100] ? hrtick_set+0xa7/0x100
> [jbd:schedule+0x364/0x8c0] ? schedule+0x364/0x870
> [run_workqueue+0x80/0x120] ? run_workqueue+0x80/0x120
> [<f89e9580>] ? ieee80211_sta_work+0x0/0x760 [mac80211]
> [worker_thread+0x88/0xe0] ? worker_thread+0x88/0xe0
> [<c013ba80>] ? autoremove_wake_function+0x0/0x40
> [worker_thread+0x0/0xe0] ? worker_thread+0x0/0xe0
> [kthread+0x42/0x70] ? kthread+0x42/0x70
> [kthread+0x0/0x70] ? kthread+0x0/0x70
> [kernel_thread_helper+0x7/0x18] ? kernel_thread_helper+0x7/0x18
> =======================
> Code: c6 00 00 8b 55 9c 8b 4d c8 8b 42 70 88 41 01 8b 42 70 8b 7d c8 89 c1
> c1 e9 02 83 c7 02 f3 a5 89 c1 83 e1 03 74 02 f3 a4 8b 5d d0 <8b> 9b 90 00 00
> 00 85 db 89 5d d8 0f 84 6d 03 00 00 8b 7d cc 8b
> EIP: [<f89e721f>] ieee80211_associate+0x24f/0x610 [mac80211] SS:ESP
> 0068:f7bbdcc0
> ---[ end trace 7afccad6600bfa21 ]---

The code decodes to:

  1d:   f3 a5                   rep movsl %ds:(%esi),%es:(%edi)
  1f:   89 c1                   mov    %eax,%ecx
  21:   83 e1 03                and    $0x3,%ecx
  24:   74 02                   je     0x28
  26:   f3 a4                   rep movsb %ds:(%esi),%es:(%edi)
  28:   8b 5d d0                mov    -0x30(%ebp),%ebx
   0:   8b 9b 90 00 00 00       mov    0x90(%ebx),%ebx     <---- BAM!
   6:   85 db                   test   %ebx,%ebx
   8:   89 5d d8                mov    %ebx,-0x28(%ebp)
   b:   0f 84 6d 03 00 00       je     0x37e
  11:   8b 7d cc                mov    -0x34(%ebp),%edi
  14:   8b                      .byte 0x8b

Recompiling net/mac80211/mlme.c shows that this happens on line 675:

ieee80211_compatible_rates      net/mac80211/mlme.c:675
ieee80211_send_assoc            net/mac80211/mlme.c:767
ieee80211_associate             net/mac80211/mlme.c:955

So it is in fact ieee80211_compatible_rates() that crashes (it is hidden
in your oops because of heavy inlining).

So looking at the latest changelog in linus/master, we have this change:

    commit 0d580a774b3682b8b2b5c89ab9b813d149ef28e7
    Author: Helmut Schaa <hschaa@xxxxxxx>
    Date:   Tue May 20 09:56:37 2008 +0200

        mac80211: fix NULL pointer dereference in ieee80211_compatible_rates

        Fix a possible NULL pointer dereference in ieee80211_compatible_rates
        introduced in the patch "mac80211: fix association with some APs". If
        no bss is available just use all supported rates in the association
        request.

        Signed-off-by: Helmut Schaa <hschaa@xxxxxxx>
        Signed-off-by: John W. Linville <linville@xxxxxxxxxxxxx>

So does applying/cherry-picking that commit fix your problem? (Patch
attached, but not inlined.)

Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
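
For readers following the decode in the message above: the fault address 00000090 and the faulting instruction `mov 0x90(%ebx),%ebx` with EBX = 0 are what a read of a structure member through a NULL pointer typically looks like, since the address is simply base plus member offset. The following is a minimal sketch of that arithmetic. The struct `layout_demo` and the helper `fault_address` are hypothetical and are not the real mac80211 bss structure; the padding only places one member at offset 0x90 so the numbers line up with the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real mac80211 bss structure: the padding
 * only pushes one member to offset 0x90 to mirror the oops. */
struct layout_demo {
	unsigned char pad[0x90];
	void *member_at_0x90;
};

/* Address the CPU would try to read for base->member_at_0x90.
 * Pure integer arithmetic, so nothing is actually dereferenced. */
static uintptr_t fault_address(uintptr_t base)
{
	return base + offsetof(struct layout_demo, member_at_0x90);
}
```

With a base of 0 (a NULL pointer), `fault_address(0)` evaluates to 0x90, which matches the address reported in the "unable to handle kernel NULL pointer dereference" line.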
commit 0d580a774b3682b8b2b5c89ab9b813d149ef28e7
Author: Helmut Schaa <hschaa@xxxxxxx>
Date:   Tue May 20 09:56:37 2008 +0200

    mac80211: fix NULL pointer dereference in ieee80211_compatible_rates

    Fix a possible NULL pointer dereference in ieee80211_compatible_rates
    introduced in the patch "mac80211: fix association with some APs". If no bss
    is available just use all supported rates in the association request.

    Signed-off-by: Helmut Schaa <hschaa@xxxxxxx>
    Signed-off-by: John W. Linville <linville@xxxxxxxxxxxxx>

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index e470bf1..7cfd12e 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -730,7 +730,17 @@ static void ieee80211_send_assoc(struct net_device *dev,
 		if (bss->wmm_ie) {
 			wmm = 1;
 		}
+
+		/* get all rates supported by the device and the AP as
+		 * some APs don't like getting a superset of their rates
+		 * in the association request (e.g. D-Link DAP 1353 in
+		 * b-only mode) */
+		rates_len = ieee80211_compatible_rates(bss, sband, &rates);
+
 		ieee80211_rx_bss_put(dev, bss);
+	} else {
+		rates = ~0;
+		rates_len = sband->n_bitrates;
 	}
 
 	mgmt = (struct ieee80211_mgmt *) skb_put(skb, 24);
@@ -761,10 +771,7 @@ static void ieee80211_send_assoc(struct net_device *dev,
 	*pos++ = ifsta->ssid_len;
 	memcpy(pos, ifsta->ssid, ifsta->ssid_len);
 
-	/* all supported rates should be added here but some APs
-	 * (e.g. D-Link DAP 1353 in b-only mode) don't like that
-	 * Therefore only add rates the AP supports */
-	rates_len = ieee80211_compatible_rates(bss, sband, &rates);
+	/* add all rates which were marked to be used above */
 	supp_rates_len = rates_len;
 	if (supp_rates_len > 8)
 		supp_rates_len = 8;
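
The control flow the patch introduces can be sketched in plain userspace C: compute the AP-compatible rate set only when a bss entry exists, and otherwise fall back to every rate the device supports. This is only an illustration under stated assumptions. The types `demo_bss` and `demo_band` and the functions `demo_compatible_rates` and `demo_pick_rates` are hypothetical stand-ins, the rate matching is a guess at the general idea rather than the kernel's actual implementation, and the sketch assumes fewer than 64 device rates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * this sketch needs are modelled. */
struct demo_bss {
	const int *supp_rates;	/* rates advertised by the AP */
	size_t supp_rates_len;
};

struct demo_band {
	const int *bitrates;	/* rates supported by the device */
	int n_bitrates;		/* assumed to be < 64 here */
};

/* Set bit i in *rates for each device rate the AP also lists and
 * return how many matched. Requires bss != NULL. */
static int demo_compatible_rates(const struct demo_bss *bss,
				 const struct demo_band *sband,
				 uint64_t *rates)
{
	int count = 0;

	*rates = 0;
	for (int i = 0; i < sband->n_bitrates; i++) {
		for (size_t j = 0; j < bss->supp_rates_len; j++) {
			if (sband->bitrates[i] == bss->supp_rates[j]) {
				*rates |= (uint64_t)1 << i;
				count++;
				break;
			}
		}
	}
	return count;
}

/* Mirrors the shape of the fix: never touch bss when it is NULL. */
static int demo_pick_rates(const struct demo_bss *bss,
			   const struct demo_band *sband,
			   uint64_t *rates)
{
	if (bss)
		return demo_compatible_rates(bss, sband, rates);

	/* No bss available: mark all device rates as usable. */
	*rates = ~(uint64_t)0;
	return sband->n_bitrates;
}
```

The point of the design is that the NULL check lives at the one call site that already knows whether a bss was found, so the helper itself can keep assuming a valid pointer.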