Oliver Neukum <oliver@xxxxxxxxxx> writes:
> On Thursday, 26 April 2012, 14:20:02, Bjørn Mork wrote:
>> This is a partial revert of
>>
>> commit 860e41a71 usb: cdc-wdm: Fix race between write and disconnect
>>
>> which caused lockups and assorted "general protection fault" and
>> "scheduling while atomic" messages when concurrent nonblocking
>> writes failed and were followed by an immediate disconnect. The
>> problem was discovered while developing userspace software for this
>> driver. This gave us a reliable way to reproduce the bug.
>>
>
> We'd better find the cause. The very first possibility is that we have
> a use after free. Please add debug printks to wdm_disconnect()
> and wdm_write() which printk desc->count.

Will do.

I am still requesting that the revert goes in as an immediate fix in the
meantime. The justification is that it is a regression (although an old
one), and that the consequences are just too bad to let the bug live
while searching for its root cause.

I have already looked at this for a couple of days without finding
anything useful, so I do not expect to be able to pin it down by myself
any time soon. I was hoping that someone more clueful than me could take
a shot at this. The reproducibility using Aleksander's tool is near 100%,
and all you need to test it is any QMI capable device. That includes any
Gobi card, and many other newer 3G or LTE modems.

Note that there is a significant delay (several seconds, up to a minute)
before the actual lockup. Unloading the cdc-wdm driver during this stage
does not prevent the bug, and does not cause any immediate crash either.
So I believe this must be related to data living in the USB subsystem?
Do we free a buffer referenced by a URB?

> And could you provide an oops?

A couple of samples.

Using qmi_wwan with cdc-wdm as a subdriver:

[ 449.521234] qmi_wwan 5-4:1.8: Tx URB has been submitted index=8
[ 449.523575] qmi_wwan 5-4:1.8: wdm_release: cleanup
[ 454.641172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 454.645101] IP: [<ffffffff8115888e>] shm_fault+0x12/0x18
[ 454.645101] PGD 23002a067 PUD 230613067 PMD 0
[ 454.645101] Oops: 0000 [#1] SMP
[ 454.645101] CPU 0
[ 454.645101] Modules linked in: qmi_wwan(O) cdc_wdm(O) bridge iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack acpi_cpufreq mperf cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_hl ip6t_LOG xt_multiport ip6table_filter iptable_filter ip6_tables ip_tables x_tables parport_pc ppdev lp parport rfcomm bnep binfmt_misc uinput microcode fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc kvm_intel kvm 8021q garp stp tun ext2 coretemp loop snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss joydev snd_pcm thinkpad_acpi snd_page_alloc nvram arc4 snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer sierra(O) usbserial btusb snd usbnet uvcvideo bluetooth iwlwifi mii videodev v4l2_compat_ioctl32 crc16 media psmouse mac80211 i915 iTCO_wdt battery iTCO_vendor_support i2c_i801 cfg80211 serio_raw evdev ac power_supply soundcore rfkill drm_kms_helper drm wmi i2c_algo_bit i2c_core video processor button ext3 mbcache jbd sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nbd usb_storage uas sr_mod cdrom sd_mod crc_t10dif uhci_hcd ahci libahci libata ehci_hcd thermal thermal_sys usbcore scsi_mod e1000e usb_common [last unloaded: cdc_wdm]
[ 454.738987]
[ 454.738987] Pid: 4667, comm: wmaker Tainted: G O 3.2.0-2-amd64 #1 LENOVO 2776LEG/2776LEG
[ 454.738987] RIP: 0010:[<ffffffff8115888e>] [<ffffffff8115888e>] shm_fault+0x12/0x18
[ 454.738987] RSP: 0000:ffff880231ac3d20 EFLAGS: 00010206
[ 454.738987] RAX: 0000000000000000 RBX: ffff88021ecb9978 RCX: ffff8802300d1230
[ 454.738987] RDX: 00007eff88da5000 RSI: ffff880231ac3d58 RDI: ffff88021ecb9978
[ 454.738987] RBP: 0000000000000029 R08: 0000000000000000 R09: 0000000000000029
[ 454.738987] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 454.738987] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8802300d1230
[ 454.738987] FS: 00007eff88e75740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
[ 454.738987] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 454.738987] CR2: 0000000000000010 CR3: 0000000230687000 CR4: 00000000000006f0
[ 454.738987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 454.738987] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 454.738987] Process wmaker (pid: 4667, threadinfo ffff880231ac2000, task ffff8802302347b0)
[ 454.738987] Stack:
[ 454.738987] ffffffff810cdb58 0000000000000000 00007eff88da5000 ffff8802300d1230
[ 454.738987] ffff88023032d800 0000000131ac3dc0 ffffffff810f9620 ffff880200000029
[ 454.738987] 0000000000000000 00007eff88da5000 0000000000000000 ffffffff00000001
[ 454.738987] Call Trace:
[ 454.882146] [<ffffffff810cdb58>] ? __do_fault+0xc8/0x3ac
[ 454.882146] [<ffffffff810f9620>] ? do_sync_readv_writev+0x9a/0xd7
[ 454.882146] [<ffffffff810d00d0>] ? handle_pte_fault+0x298/0x79f
[ 454.882146] [<ffffffff810cd6ad>] ? pte_offset_kernel+0x16/0x35
[ 454.882146] [<ffffffff8134c243>] ? do_page_fault+0x312/0x337
[ 454.882146] [<ffffffff81349835>] ? page_fault+0x25/0x30
[ 454.882146] Code: 89 ef e8 87 1d f9 ff 48 83 c4 48 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 90 90 48 8b 87 98 00 00 00 48 8b 80 a0 00 00 00 48 8b 40 18 <48> 8b 40 10 ff e0 48 8b 87 98 00 00 00 48 8b 80 a0 00 00 00 48
[ 454.882146] RIP [<ffffffff8115888e>] shm_fault+0x12/0x18
[ 454.882146] RSP <ffff880231ac3d20>
[ 454.882146] CR2: 0000000000000010
[ 454.952349] ---[ end trace 5786e416cc26ff98 ]---

A different sample, using cdc-wdm as a standalone driver (and an older
version, as you can see from the "cdc-wdm-176" names):

[ 140.593274] cdc_wdm 6-4:1.8: cdc-wdm-176: USB WDM device
[ 140.593559] cdc_wdm 6-4:1.19: cdc-wdm-175: USB WDM device
[ 140.594005] cdc_wdm 6-4:1.20: cdc-wdm-174: USB WDM device
[ 341.527338] cdc_wdm 6-4:1.8: nonzero urb status received: -EPIPE
[ 341.527572] cdc_wdm 6-4:1.8: Error in flush path: -32
[ 344.048731] general protection fault: 0000 [#1] SMP
[ 344.048850] CPU 0
[ 344.048890] Modules linked in: cdc_wdm(O) bridge iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack acpi_cpufreq mperf cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_hl ip6t_LOG xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables parport_pc ppdev lp parport rfcomm bnep binfmt_misc uinput microcode fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc kvm_intel kvm 8021q garp stp tun ext2 coretemp loop snd_hda_codec_conexant joydev snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 thinkpad_acpi nvram snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi sierra(O) usbserial uvcvideo videodev iwlwifi snd_seq v4l2_compat_ioctl32 btusb usbnet bluetooth snd_seq_device media mii crc16 snd_timer mac80211 snd i915 cfg80211 psmouse iTCO_wdt ac battery serio_raw i2c_i801 evdev power_supply iTCO_vendor_support soundcore rfkill wmi drm_kms_helper drm i2c_algo_bit i2c_core video button processor ext3 mbcache jbd sha256_generic cryptd aes_x86_64 aes_generic cbc usb_storage uas dm_crypt dm_mod nbd sr_mod cdrom sd_mod crc_t10dif ahci libahci libata uhci_hcd thermal thermal_sys ehci_hcd scsi_mod e1000e usbcore usb_common [last unloaded: cdc_wdm]
[ 344.051792]
[ 344.051827] Pid: 2796, comm: Xorg Tainted: G O 3.2.0-2-amd64 #1 LENOVO 2776LEG/2776LEG
[ 344.052001] RIP: 0010:[<ffffffffa01bdb3c>] [<ffffffffa01bdb3c>] drm_clflush_pages+0x70/0xc8 [drm]
[ 344.052183] RSP: 0018:ffff880230c91ca8 EFLAGS: 00010286
[ 344.052277] RAX: d7d87f898ef74000 RBX: 6db6db6db6db6db7 RCX: 0000000000000000
[ 344.052400] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88022fc24820
[ 344.052522] RBP: 0000160000000000 R08: ffff880230c91fd8 R09: ffff880230c91fd8
[ 344.052583] R10: d7d87f898ef74000 R11: ffff880000000000 R12: 0000000000000000
[ 344.052583] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000008
[ 344.052583] FS: 00007f51d1740880(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
[ 344.052583] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 344.052583] CR2: 00007fe9942a5ad0 CR3: 0000000230ff2000 CR4: 00000000000006f0
[ 344.052583] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 344.052583] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 344.052583] Process Xorg (pid: 2796, threadinfo ffff880230c90000, task ffff88022fb387b0)
[ 344.052583] Stack:
[ 344.052583] 000000000000009c ffff88023144e800 0000000000000001 ffffffffa026ddfa
[ 344.052583] ffff88023144e800 0000000000000000 00000000000002c3 ffff88023144e800
[ 344.052583] ffff880231c6e800 ffffffffa026e790 ffff88023144e800 ffff880231baa000
[ 344.052583] Call Trace:
[ 344.052583] [<ffffffffa026ddfa>] ? i915_gem_object_set_to_cpu_domain+0xd3/0x114 [i915]
[ 344.052583] [<ffffffffa026e790>] ? i915_gem_object_unbind+0x94/0x18a [i915]
[ 344.052583] [<ffffffffa026e89d>] ? i915_gem_free_object_tail+0x17/0xdf [i915]
[ 344.052583] [<ffffffffa01bfd94>] ? drm_gem_handle_create+0xc0/0xc0 [drm]
[ 344.052583] [<ffffffff811aaade>] ? kref_put+0x3e/0x47
[ 344.052583] [<ffffffffa01bfaff>] ? drm_gem_object_unreference_unlocked+0x2b/0x3a [drm]
[ 344.052583] [<ffffffffa01bfcc9>] ? drm_gem_handle_delete+0x79/0x84 [drm]
[ 344.052583] [<ffffffffa01be61f>] ? drm_ioctl+0x289/0x35e [drm]
[ 344.052583] [<ffffffffa01c0072>] ? drm_gem_destroy+0x3a/0x3a [drm]
[ 344.052583] [<ffffffff811ae1dc>] ? timerqueue_del+0x53/0x63
[ 344.052583] [<ffffffff81061c96>] ? __remove_hrtimer+0x2a/0x82
[ 344.052583] [<ffffffff81106625>] ? do_vfs_ioctl+0x459/0x49a
[ 344.052583] [<ffffffff811066b1>] ? sys_ioctl+0x4b/0x72
[ 344.052583] [<ffffffff8134e252>] ? system_call_fastpath+0x16/0x1b
[ 344.052583] Code: 4b 48 8b 04 d7 48 85 c0 74 3f 41 ff 80 44 e0 ff ff 48 01 e8 31 c9 48 c1 f8 03 48 0f af c3 48 c1 e0 0c 4c 01 d8 41 89 ca 49 01 c2 <41> 0f ae 3a 44 0f b7 15 de d8 4c e1 44 01 d1 81 f9 ff 0f 00 00
[ 344.052583] RIP [<ffffffffa01bdb3c>] drm_clflush_pages+0x70/0xc8 [drm]
[ 344.052583] RSP <ffff880230c91ca8>
[ 344.057099] ---[ end trace 9a872e6a30ac51c6 ]---


Bjørn