Re: Fwd: PROBLEM: Permanent kernel panic in USB hub driver - 3.5.0-22

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,
Today I have "successfully" reproduced the bug on 3.7.4 kernel. The symptoms looks to be exactly the same - null pointer dereference in interrupt context, somewhere in xhci IRQ handler. Unfortunately this time I have no core dumps - for some reason crash kernel is not started after the crash, however dmesg shows that the space is preallocated for it. However, I think the root cause is the same as in previous cores, so it is still would be worth to analyze them. Here is a new picture of the panic:
https://dl.dropbox.com/u/8276110/3.7.4%20panic.jpg

$ cat /proc/version
Linux version 3.7.4 (vagran@AST-mobile) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #1 SMP Sat Jan 26 12:24:36 EET 2013

The warning followed by bus malfunction also still persists:
Jan 26 13:03:42 AST-mobile kernel: [ 231.054897] usb 3-2: new low-speed USB device number 14 using xhci_hcd Jan 26 13:03:47 AST-mobile kernel: [ 236.067310] usb 3-2: device descriptor read/all, error -110 Jan 26 13:03:47 AST-mobile kernel: [ 236.179193] usb 3-2: new low-speed USB device number 15 using xhci_hcd Jan 26 13:03:52 AST-mobile kernel: [ 241.175598] xhci_hcd 0000:00:14.0: Timeout while waiting for address device command Jan 26 13:03:53 AST-mobile kernel: [ 241.379407] ------------[ cut here ]------------ Jan 26 13:03:53 AST-mobile kernel: [ 241.379426] WARNING: at drivers/usb/host/xhci.c:3657 xhci_address_device+0x2db/0x300()
Jan 26 13:03:53 AST-mobile kernel: [  241.379429] Hardware name: 24382LU
Jan 26 13:03:53 AST-mobile kernel: [ 241.379431] Modules linked in: ftdi_sio usbserial bnep rfcomm parport_pc ppdev binfmt_misc ext2 arc4 iwldvm mac80211 coretemp snd_hda_codec_realtek kvm_intel kvm snd_hda_intel snd_hda_codec iwlwifi snd_hwdep snd_pcm thinkpad_acpi ghash_clmulni_intel aesni_intel ablk_helper btusb snd_seq_midi cryptd snd_rawmidi bluetooth lrw snd_seq_midi_event aes_x86_64 snd_seq xts cfg80211 gf128mul psmouse snd_timer snd_seq_device microcode serio_raw joydev snd nvram tpm_tis lpc_ich soundcore mei snd_page_alloc mac_hid usbmon lp parport nouveau hid_generic i915 sdhci_pci sdhci usbhid hid ttm firewire_ohci drm_kms_helper firewire_core e1000e drm crc_itu_t i2c_algo_bit mxm_wmi wmi video Jan 26 13:03:53 AST-mobile kernel: [ 241.379505] Pid: 57, comm: khubd Not tainted 3.7.4 #1
Jan 26 13:03:53 AST-mobile kernel: [  241.379507] Call Trace:
Jan 26 13:03:53 AST-mobile kernel: [ 241.379518] [<ffffffff810575bf>] warn_slowpath_common+0x7f/0xc0 Jan 26 13:03:53 AST-mobile kernel: [ 241.379523] [<ffffffff8105761a>] warn_slowpath_null+0x1a/0x20 Jan 26 13:03:53 AST-mobile kernel: [ 241.379528] [<ffffffff814ea5cb>] xhci_address_device+0x2db/0x300 Jan 26 13:03:53 AST-mobile kernel: [ 241.379535] [<ffffffff814bd3c9>] hub_port_init+0x229/0x9f0 Jan 26 13:03:53 AST-mobile kernel: [ 241.379543] [<ffffffff8143763d>] ? pm_runtime_set_autosuspend_delay+0x5d/0x80 Jan 26 13:03:53 AST-mobile kernel: [ 241.379551] [<ffffffff81044b49>] ? default_spin_lock_flags+0x9/0x10 Jan 26 13:03:53 AST-mobile kernel: [ 241.379556] [<ffffffff814c051f>] hub_thread+0x54f/0x1380 Jan 26 13:03:53 AST-mobile kernel: [ 241.379563] [<ffffffff8107c910>] ? finish_wait+0x80/0x80 Jan 26 13:03:53 AST-mobile kernel: [ 241.379568] [<ffffffff814bffd0>] ? usb_remote_wakeup+0x40/0x40 Jan 26 13:03:53 AST-mobile kernel: [ 241.379572] [<ffffffff8107bfc0>] kthread+0xc0/0xd0 Jan 26 13:03:53 AST-mobile kernel: [ 241.379577] [<ffffffff8107bf00>] ? kthread_create_on_node+0x130/0x130 Jan 26 13:03:53 AST-mobile kernel: [ 241.379584] [<ffffffff816a102c>] ret_from_fork+0x7c/0xb0 Jan 26 13:03:53 AST-mobile kernel: [ 241.379588] [<ffffffff8107bf00>] ? kthread_create_on_node+0x130/0x130 Jan 26 13:03:53 AST-mobile kernel: [ 241.379591] ---[ end trace b3ec4f772c11ce62 ]--- Jan 26 13:03:53 AST-mobile kernel: [ 241.379598] xhci_hcd 0000:00:14.0: Virt dev invalid for slot_id 0xe! Jan 26 13:03:53 AST-mobile kernel: [ 241.583272] usb 3-2: device not accepting address 15, error -22 Jan 26 13:03:53 AST-mobile kernel: [ 241.583296] xHCI xhci_free_dev called with unaddressed device Jan 26 13:03:58 AST-mobile kernel: [ 246.579674] xhci_hcd 0000:00:14.0: Timeout while waiting for a slot Jan 26 13:03:58 AST-mobile kernel: [ 246.579692] hub 3-0:1.0: couldn't allocate port 2 usb_device Jan 26 13:04:18 AST-mobile kernel: [ 266.681173] xhci_hcd 0000:00:14.0: Timeout while waiting for a slot Jan 26 13:04:18 AST-mobile kernel: [ 266.681191] hub 3-0:1.0: couldn't allocate port 2 usb_device

Jan 26 13:05:19 AST-mobile kernel: [ 328.100883] xhci_hcd 0000:00:14.0: Timeout while waiting for a slot Jan 26 13:05:19 AST-mobile kernel: [ 328.100902] hub 3-0:1.0: couldn't allocate port 2 usb_device Jan 26 13:05:32 AST-mobile kernel: [ 341.027570] xhci_hcd 0000:00:14.0: Timeout while waiting for a slot Jan 26 13:05:32 AST-mobile kernel: [ 341.027590] hub 3-0:1.0: couldn't allocate port 2 usb_device

Also I found another case of the bus malfunction - some allocation failures are reported. As with the previous problem it continues even when normal device is plugged in after the bad one:
Jan 26 13:09:38 AST-mobile kernel: [ 569.791564] usb 3-2: new low-speed USB device number 31 using xhci_hcd Jan 26 13:09:43 AST-mobile kernel: [ 574.808971] usb 3-2: device descriptor read/all, error -110 Jan 26 13:09:44 AST-mobile kernel: [ 574.920977] usb 3-2: new low-speed USB device number 32 using xhci_hcd Jan 26 13:09:49 AST-mobile kernel: [ 579.935365] usb 3-2: device descriptor read/8, error -110 Jan 26 13:09:49 AST-mobile kernel: [ 580.056632] usb 3-2: device descriptor read/8, error -71 Jan 26 13:09:49 AST-mobile kernel: [ 580.159208] xhci_hcd 0000:00:14.0: Bad Slot ID 6 Jan 26 13:09:49 AST-mobile kernel: [ 580.159216] xhci_hcd 0000:00:14.0: Could not allocate xHCI USB device data structures Jan 26 13:09:49 AST-mobile kernel: [ 580.159229] hub 3-0:1.0: couldn't allocate port 2 usb_device

Jan 26 13:10:46 AST-mobile kernel: [ 637.621728] xhci_hcd 0000:00:14.0: Bad Slot ID 7 Jan 26 13:10:46 AST-mobile kernel: [ 637.621736] xhci_hcd 0000:00:14.0: Could not allocate xHCI USB device data structures Jan 26 13:10:46 AST-mobile kernel: [ 637.621748] hub 3-0:1.0: couldn't allocate port 2 usb_device

Jan 26 13:11:10 AST-mobile kernel: [ 661.544389] xhci_hcd 0000:00:14.0: Bad Slot ID 8 Jan 26 13:11:10 AST-mobile kernel: [ 661.544392] xhci_hcd 0000:00:14.0: Could not allocate xHCI USB device data structures Jan 26 13:11:10 AST-mobile kernel: [ 661.544400] hub 3-0:1.0: couldn't allocate port 2 usb_device
This time scenario is a bit different (actually previous crashes used the same but I have previously described the very first one, when I noticed the problem) - host retrieves first 8 bytes of device descriptor, and requests full descriptor after that. The device accepts the request, returns the first 8 bytes (the device has 8 bytes maximal packet size) and after that continues to send NAKs for all the next IN tokens from the host (see https://dl.dropbox.com/u/8276110/transaction_analysis.txt for dump). Probably the problem occurs for all uncompleted transactions.

Also I have tried to plug device into ehci port, but was unable to recreate the problem with it (nor crash, nor any suspicious messages in dmesg). Possibly (I'm not sure, because sometimes I need many plug/unplug retries to recreate the crash) the problem is specific to xhci.

Best regards,
Artyom.

On 01/26/2013 02:57 AM, Sarah Sharp wrote:
Can you please retry with the latest 3.7 stable kernel?  The 3.5 kernel
didn't have support for command cancellation, which is why the Set
Address command is hanging there.  The hung command sits on the command
ring, which means any following command doesn't get run.  That's why you
get timeouts when you plug in a new device: the slot allocation command
for that new device never gets executed, because the xHCI host is still
stuck on the previous command.

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux