Re: PROBLEM: Permanent kernel panic in USB hub driver - 3.5.0-22

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 25, 2013 at 11:54:06PM +0200, Artemy Lebedev wrote:
> Hi,
> I'm developing USB device with software implementation of USB
> protocol stack (even link level). It is not yet fully completed so
> it behaves wrong from host point of view. However now it is a good
> stress testing tool for the OS which discovered a couple of
> problems. I understand that the use case is quite unusual but still
> the problem is obvious. I have a permanent scenario for making a
> kernel panic.
> 
> So the scenario is quite simple - somewhere in the middle of USB
> control transaction the device stops to answer host requests (just
> silence) or in another scenario the device continuosly responds with
> NAKs to IN tokens. I have attached the file which contains some
> system information as well as description of the signaling on USB
> bus. There is an oscillogram capture in CSV format (volts on D- and
> D+ lines) changed in time (signal.csv.zip). For your convenience, I
> have parsed it by my analyzing script, see its output in
> signal_analysis.txt. Here it is seen that host sends "Get device
> descriptor" request, the device acknowledges the DATA0 packet by ACK
> in frame 3 and responds by NAK in frame 5 for IN token of status
> stage in frame 4. All the following IN tokens are ignored and host
> retries the whole transaction from the beginning while the device
> still keep silence.
> 
> This behavior leads to two bad effects on host. The first one is the
> following message in dmesg:
> 
> Jan 23 20:39:56 AST-mobile kernel: [ 355.978595] usb 3-2: new
> low-speed USB device number 8 using xhci_hcd
> Jan 23 20:40:01 AST-mobile kernel: [ 360.977493] xhci_hcd
> 0000:00:14.0: Timeout while waiting for address device command
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181444] ------------[ cut
> here ]------------
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181461] WARNING: at
> /build/buildd/linux-3.5.0/drivers/usb/host/xhci.c:3646
> xhci_address_device+0x2db/0x300()
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181464] Hardware name: 24382LU
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181466] Modules linked in:
> btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos
> jfs xfs reiserfs bnep rfcomm parport_pc ppdev binfmt_misc ext2
> coretemp kvm_intel kvm ghash_clmulni_intel aesni_intel cryptd arc4
> aes_x86_64 snd_hda_codec_realtek microcode joydev uvcvideo
> videobuf2_core videodev thinkpad_acpi videobuf2_vmalloc iwlwifi
> lpc_ich videobuf2_memops psmouse mac80211 ftdi_sio serio_raw
> usbserial btusb snd_hda_intel snd_seq_midi snd_hda_codec bluetooth
> cfg80211 snd_hwdep snd_rawmidi snd_pcm snd_seq_midi_event snd_seq
> snd_seq_device snd_timer snd_page_alloc snd soundcore mei nvram
> tpm_tis mac_hid usbmon lp parport hid_generic usbhid hid i915
> nouveau firewire_ohci ttm drm_kms_helper firewire_core crc_itu_t drm
> i2c_algo_bit mxm_wmi sdhci_pci sdhci e1000e wmi video
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181552] Pid: 48, comm:
> khubd Not tainted 3.5.0-22-generic #34-Ubuntu
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181555] Call Trace:
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181567]
> [<ffffffff81051c1f>] warn_slowpath_common+0x7f/0xc0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181572]
> [<ffffffff81051c7a>] warn_slowpath_null+0x1a/0x20
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181578]
> [<ffffffff814dab8b>] xhci_address_device+0x2db/0x300
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181585]
> [<ffffffff814ae599>] hub_port_init+0x229/0x9f0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181594]
> [<ffffffff8142ba5d>] ? pm_runtime_set_autosuspend_delay+0x5d/0x80
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181600]
> [<ffffffff8103fa39>] ? default_spin_lock_flags+0x9/0x10
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181606]
> [<ffffffff814b1e03>] hub_port_connect_change+0x283/0x9c0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181611]
> [<ffffffff814b2943>] hub_events+0x403/0x690
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181619]
> [<ffffffff8168244f>] ? __schedule+0x3cf/0x7c0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181624]
> [<ffffffff814b2c05>] hub_thread+0x35/0x1a0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181632]
> [<ffffffff810768b0>] ? finish_wait+0x80/0x80
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181637]
> [<ffffffff814b2bd0>] ? hub_events+0x690/0x690
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181642]
> [<ffffffff81075f13>] kthread+0x93/0xa0
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181648]
> [<ffffffff8168d024>] kernel_thread_helper+0x4/0x10
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181653]
> [<ffffffff81075e80>] ? kthread_freezable_should_stop+0x70/0x70
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181657]
> [<ffffffff8168d020>] ? gs_change+0x13/0x13
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181660] ---[ end trace
> f87cad0b99ce3880 ]---
> Jan 23 20:40:01 AST-mobile kernel: [ 361.181665] xhci_hcd
> 0000:00:14.0: Virt dev invalid for slot_id 0x7!
> Jan 23 20:40:01 AST-mobile kernel: [ 361.385433] usb 3-2: device not
> accepting address 8, error -22
> Jan 23 20:40:01 AST-mobile kernel: [ 361.385456] xHCI xhci_free_dev
> called with unaddressed device
> Jan 23 20:40:06 AST-mobile kernel: [ 366.384344] xhci_hcd
> 0000:00:14.0: Timeout while waiting for a slot
> Jan 23 20:40:06 AST-mobile kernel: [ 366.384375] hub 3-0:1.0:
> couldn't allocate port 2 usb_device
> 
> I did not pay attention to kernel warnings about timeouts since they
> are expected, but the backtrace indicates that something bad
> occurred in the kernel (LKM crash?).

If you're talking about the backtrace above, that's this WARN_ON:

        virt_dev = xhci->devs[udev->slot_id];

	if (WARN_ON(!virt_dev)) {		<---

> After that whole the bus
> becomes unusable. Even if the normal device is plugged into the
> port, it is not handled and the following messages continuously
> repeating:
> Jan 23 20:40:06 AST-mobile kernel: [ 366.384344] xhci_hcd
> 0000:00:14.0: Timeout while waiting for a slot
> Jan 23 20:40:06 AST-mobile kernel: [ 366.384375] hub 3-0:1.0:
> couldn't allocate port 2 usb_device
> This ends only after the bus controller is powered off and on (e.g.
> sleep the system and wake up again).
> 
> The second problem is even more interesting, it's a kernel panic.
> Unfortunately I did find how to get the panic messages from the
> kernel persistent ring buffer (if there is such) after reboot. So I
> have taken a photo, available here:
> https://dl.dropbox.com/u/8276110/2013-01-25%2022.49.16.jpg

That's missing the beginning of the screen which says what the panic
is but I can see xhci_irq up in the stack trace and RIP points to
xhci_stream_id_to_ring so something going funny in the interrupt
handler.

> And, of course, the sweets - two kernel core dumps (possibly the
> same, but just for a case created them twice):
> https://dl.dropbox.com/u/8276110/vmcore.gz
> https://dl.dropbox.com/u/8276110/vmcore2.gz
> 
> So, that's all I have. See attached file and cores. Also if there
> will be a patch I can test it with my setup (required firmware
> version of my device is specially tagged as "Linux exploit" in my
> repository :).
> 
> P.S. Sorry, probably I have violated all rules of writing to mail
> list and bug reporting but I did not managed to find clear
> instructions how it should be done. Bugzilla seems to be only for
> 2.6 version. Where to put big attachments (like core dumps) is also
> unclear. So this procedure for Linux kernel is toо unobvious for
> ordinary system developer.

I've added linux-usb@ to Cc.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux