Re: [PATCH] usb: cdc-wdm: Fix "scheduling while atomic" after failed write

Oliver Neukum <oliver@xxxxxxxxxx> writes:
> On Thursday, 26 April 2012, 14:20:02, Bjørn Mork wrote:
>> This is a partial revert of
>> 
>>   commit 860e41a71 usb: cdc-wdm: Fix race between write and disconnect
>> 
>> which caused lockups and assorted "general protection fault" and
>> "scheduling while atomic" messages when concurrent nonblocking
>> writes failed and were followed by an immediate disconnect.  The
>> problem was discovered while developing userspace software for this
>> driver. This gave us a reliable way to reproduce the bug.
>> 
>
> We'd better find the cause. The very first possibility is that we have
> a use after free. Please add debug printks to wdm_disconnect()
> and wdm_write() which printk desc->count.

Will do.
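
The kind of instrumentation requested above could look roughly like the
sketch below.  It is a userspace stand-in, not a patch: struct
wdm_device is reduced to its count field, printk() and KERN_DEBUG are
shimmed so the file builds outside the kernel, and the helper name
wdm_log_count() is hypothetical.

```c
#include <stdio.h>

/* Userspace shims (assumptions) so the sketch compiles outside the kernel. */
#define KERN_DEBUG ""
#define printk(...) fprintf(stderr, __VA_ARGS__)

/* Stand-in for the driver's struct wdm_device; only the open
 * counter discussed in the thread is modelled here. */
struct wdm_device {
	int count;
};

/* Hypothetical helper: log the counter and return it unchanged.
 * In the driver, the printk() line would sit at the top of
 * wdm_write() and wdm_disconnect(), passing __func__ as func. */
static int wdm_log_count(const char *func, const struct wdm_device *desc)
{
	printk(KERN_DEBUG "%s: desc->count=%d\n", func, desc->count);
	return desc->count;
}
```

Implausible values logged from either function after the failed writes
would be consistent with the suspected use after free.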

I still request that the revert go in as an immediate fix in the
meantime. The justification is that this is a regression (although an
old one), and that the consequences are too severe to let the bug live
while we search for its root cause.

I have already looked at this for a couple of days without finding
anything useful, so I do not expect to be able to pin it down by myself
any time soon.  I was hoping that someone more clueful than me could
take a shot at this.  The reproducibility using Aleksander's tool is
near 100%, and all you need to test it is any QMI-capable device.  That
includes any Gobi card, and many other newer 3G or LTE modems.

Note that there is a significant delay (several seconds, up to a
minute) before the actual lockup. Unloading the cdc-wdm driver during
this stage does not prevent the bug, and does not cause any immediate
crash either.  So I believe this must be related to data living in the
USB subsystem.  Do we free a buffer that is still referenced by a URB?

> And could you provide an oops?

A couple of samples.  Using qmi_wwan with cdc-wdm as a subdriver:

[  449.521234] qmi_wwan 5-4:1.8: Tx URB has been submitted index=8
[  449.523575] qmi_wwan 5-4:1.8: wdm_release: cleanup
[  454.641172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[  454.645101] IP: [<ffffffff8115888e>] shm_fault+0x12/0x18
[  454.645101] PGD 23002a067 PUD 230613067 PMD 0 
[  454.645101] Oops: 0000 [#1] SMP 
[  454.645101] CPU 0 
[  454.645101] Modules linked in: qmi_wwan(O) cdc_wdm(O) bridge iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack acpi_cpufreq mperf cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_hl ip6t_LOG xt_multiport ip6table_filter iptable_filter ip6_tables ip_tables x_tables parport_pc ppdev lp parport rfcomm bnep binfmt_misc uinput microcode fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc kvm_intel kvm 8021q garp stp tun ext2 coretemp loop snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss joydev snd_pcm thinkpad_acpi snd_page_alloc nvram arc4 snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer sierra(O) usbserial btusb snd usbnet uvcvideo bluetooth iwlwifi mii videodev v4l2_compat_ioctl32 crc16 media psmouse mac80211 i915 iTCO_wdt battery iTCO_vendor_support i2c_i801 cfg80211 serio_raw evdev ac power_supply soundcore rfkill drm_kms_helper drm wmi i2c_algo_bit i2c_core video processor button ext3 mbcache jbd sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nbd usb_storage uas sr_mod cdrom sd_mod crc_t10dif uhci_hcd ahci libahci libata ehci_hcd thermal thermal_sys usbcore scsi_mod e1000e usb_common [last unloaded: cdc_wdm]
[  454.738987] 
[  454.738987] Pid: 4667, comm: wmaker Tainted: G           O 3.2.0-2-amd64 #1 LENOVO 2776LEG/2776LEG
[  454.738987] RIP: 0010:[<ffffffff8115888e>]  [<ffffffff8115888e>] shm_fault+0x12/0x18
[  454.738987] RSP: 0000:ffff880231ac3d20  EFLAGS: 00010206
[  454.738987] RAX: 0000000000000000 RBX: ffff88021ecb9978 RCX: ffff8802300d1230
[  454.738987] RDX: 00007eff88da5000 RSI: ffff880231ac3d58 RDI: ffff88021ecb9978
[  454.738987] RBP: 0000000000000029 R08: 0000000000000000 R09: 0000000000000029
[  454.738987] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  454.738987] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8802300d1230
[  454.738987] FS:  00007eff88e75740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
[  454.738987] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  454.738987] CR2: 0000000000000010 CR3: 0000000230687000 CR4: 00000000000006f0
[  454.738987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  454.738987] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  454.738987] Process wmaker (pid: 4667, threadinfo ffff880231ac2000, task ffff8802302347b0)
[  454.738987] Stack:
[  454.738987]  ffffffff810cdb58 0000000000000000 00007eff88da5000 ffff8802300d1230
[  454.738987]  ffff88023032d800 0000000131ac3dc0 ffffffff810f9620 ffff880200000029
[  454.738987]  0000000000000000 00007eff88da5000 0000000000000000 ffffffff00000001
[  454.738987] Call Trace:
[  454.882146]  [<ffffffff810cdb58>] ? __do_fault+0xc8/0x3ac
[  454.882146]  [<ffffffff810f9620>] ? do_sync_readv_writev+0x9a/0xd7
[  454.882146]  [<ffffffff810d00d0>] ? handle_pte_fault+0x298/0x79f
[  454.882146]  [<ffffffff810cd6ad>] ? pte_offset_kernel+0x16/0x35
[  454.882146]  [<ffffffff8134c243>] ? do_page_fault+0x312/0x337
[  454.882146]  [<ffffffff81349835>] ? page_fault+0x25/0x30
[  454.882146] Code: 89 ef e8 87 1d f9 ff 48 83 c4 48 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 90 90 48 8b 87 98 00 00 00 48 8b 80 a0 00 00 00 48 8b 40 18 <48> 8b 40 10 ff e0 48 8b 87 98 00 00 00 48 8b 80 a0 00 00 00 48 
[  454.882146] RIP  [<ffffffff8115888e>] shm_fault+0x12/0x18
[  454.882146]  RSP <ffff880231ac3d20>
[  454.882146] CR2: 0000000000000010
[  454.952349] ---[ end trace 5786e416cc26ff98 ]---


A different sample, using cdc-wdm as a standalone driver (and an older
version, as you can see from the "cdc-wdm-176" names):

[  140.593274] cdc_wdm 6-4:1.8: cdc-wdm-176: USB WDM device
[  140.593559] cdc_wdm 6-4:1.19: cdc-wdm-175: USB WDM device
[  140.594005] cdc_wdm 6-4:1.20: cdc-wdm-174: USB WDM device
[  341.527338] cdc_wdm 6-4:1.8: nonzero urb status received: -EPIPE
[  341.527572] cdc_wdm 6-4:1.8: Error in flush path: -32
[  344.048731] general protection fault: 0000 [#1] SMP 
[  344.048850] CPU 0 
[  344.048890] Modules linked in: cdc_wdm(O) bridge iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack acpi_cpufreq mperf cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_hl ip6t_LOG xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables parport_pc ppdev lp parport rfcomm bnep binfmt_misc uinput microcode fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc kvm_intel kvm 8021q garp stp tun ext2 coretemp loop snd_hda_codec_conexant joydev snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 thinkpad_acpi nvram snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi sierra(O) usbserial uvcvideo videodev iwlwifi snd_seq v4l2_compat_ioctl32 btusb usbnet bluetooth snd_seq_device media mii crc16 snd_timer mac80211 snd i915 cfg80211 psmouse iTCO_wdt ac battery serio_raw i2c_i801 evdev power_supply iTCO_vendor_support soundcore rfkill wmi drm_kms_helper drm i2c_algo_bit i2c_core video button processor ext3 mbcache jbd sha256_generic cryptd aes_x86_64 aes_generic cbc usb_storage uas dm_crypt dm_mod nbd sr_mod cdrom sd_mod crc_t10dif ahci libahci libata uhci_hcd thermal thermal_sys ehci_hcd scsi_mod e1000e usbcore usb_common [last unloaded: cdc_wdm]
[  344.051792] 
[  344.051827] Pid: 2796, comm: Xorg Tainted: G           O 3.2.0-2-amd64 #1 LENOVO 2776LEG/2776LEG
[  344.052001] RIP: 0010:[<ffffffffa01bdb3c>]  [<ffffffffa01bdb3c>] drm_clflush_pages+0x70/0xc8 [drm]
[  344.052183] RSP: 0018:ffff880230c91ca8  EFLAGS: 00010286
[  344.052277] RAX: d7d87f898ef74000 RBX: 6db6db6db6db6db7 RCX: 0000000000000000
[  344.052400] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88022fc24820
[  344.052522] RBP: 0000160000000000 R08: ffff880230c91fd8 R09: ffff880230c91fd8
[  344.052583] R10: d7d87f898ef74000 R11: ffff880000000000 R12: 0000000000000000
[  344.052583] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000008
[  344.052583] FS:  00007f51d1740880(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
[  344.052583] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  344.052583] CR2: 00007fe9942a5ad0 CR3: 0000000230ff2000 CR4: 00000000000006f0
[  344.052583] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  344.052583] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  344.052583] Process Xorg (pid: 2796, threadinfo ffff880230c90000, task ffff88022fb387b0)
[  344.052583] Stack:
[  344.052583]  000000000000009c ffff88023144e800 0000000000000001 ffffffffa026ddfa
[  344.052583]  ffff88023144e800 0000000000000000 00000000000002c3 ffff88023144e800
[  344.052583]  ffff880231c6e800 ffffffffa026e790 ffff88023144e800 ffff880231baa000
[  344.052583] Call Trace:
[  344.052583]  [<ffffffffa026ddfa>] ? i915_gem_object_set_to_cpu_domain+0xd3/0x114 [i915]
[  344.052583]  [<ffffffffa026e790>] ? i915_gem_object_unbind+0x94/0x18a [i915]
[  344.052583]  [<ffffffffa026e89d>] ? i915_gem_free_object_tail+0x17/0xdf [i915]
[  344.052583]  [<ffffffffa01bfd94>] ? drm_gem_handle_create+0xc0/0xc0 [drm]
[  344.052583]  [<ffffffff811aaade>] ? kref_put+0x3e/0x47
[  344.052583]  [<ffffffffa01bfaff>] ? drm_gem_object_unreference_unlocked+0x2b/0x3a [drm]
[  344.052583]  [<ffffffffa01bfcc9>] ? drm_gem_handle_delete+0x79/0x84 [drm]
[  344.052583]  [<ffffffffa01be61f>] ? drm_ioctl+0x289/0x35e [drm]
[  344.052583]  [<ffffffffa01c0072>] ? drm_gem_destroy+0x3a/0x3a [drm]
[  344.052583]  [<ffffffff811ae1dc>] ? timerqueue_del+0x53/0x63
[  344.052583]  [<ffffffff81061c96>] ? __remove_hrtimer+0x2a/0x82
[  344.052583]  [<ffffffff81106625>] ? do_vfs_ioctl+0x459/0x49a
[  344.052583]  [<ffffffff811066b1>] ? sys_ioctl+0x4b/0x72
[  344.052583]  [<ffffffff8134e252>] ? system_call_fastpath+0x16/0x1b
[  344.052583] Code: 4b 48 8b 04 d7 48 85 c0 74 3f 41 ff 80 44 e0 ff ff 48 01 e8 31 c9 48 c1 f8 03 48 0f af c3 48 c1 e0 0c 4c 01 d8 41 89 ca 49 01 c2 <41> 0f ae 3a 44 0f b7 15 de d8 4c e1 44 01 d1 81 f9 ff 0f 00 00 
[  344.052583] RIP  [<ffffffffa01bdb3c>] drm_clflush_pages+0x70/0xc8 [drm]
[  344.052583]  RSP <ffff880230c91ca8>
[  344.057099] ---[ end trace 9a872e6a30ac51c6 ]---




Bjørn