Re: Did USB slowly kill/hang my kernel? (3.12.7)

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> · Sat, 15 Feb 2014 17:45:54 -0500 (EST)

On Sat, 15 Feb 2014, Marc MERLIN wrote:

> My server at home hung after months of runtime, and I found the
> following on the serial console. Interestingly, it seems the software
> watchdog wasn't even able to reboot the system; it needed a power cycle.
> 
> Is there other info I can provide/different kernel options I should use?

You can try turning on CONFIG_USB_DEBUG.  It may not help much, though.
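
Before rebuilding, it can be worth checking whether the option is already set
in the running kernel's configuration. The following is a minimal Python
sketch, not an official tool. The helper name config_option_state and the
sample config text are made up for illustration. On a real system the text
would come from wherever the distribution keeps the kernel config, which is
often (but not always) a config file under /boot.

```python
import re

def config_option_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...),
    or None if the option is absent or marked 'is not set'."""
    # A set option appears as OPTION=value at the start of a line.
    # "# OPTION is not set" lines do not match and fall through to None.
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    m = pattern.search(config_text)
    return m.group(1) if m else None

# Illustrative sample only; not taken from the reporter's machine.
SAMPLE_CONFIG = """\
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
CONFIG_USB_STORAGE=m
"""

if __name__ == "__main__":
    for opt in ("CONFIG_USB", "CONFIG_USB_DEBUG", "CONFIG_USB_STORAGE"):
        print(opt, "->", config_option_state(SAMPLE_CONFIG, opt))
```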

How reproducible is this failure?

> [1836102.933294] CPU: 1 PID: 426 Comm: khubd Tainted: G        W    3.12.7-amd64-i915-preempt-20131107 #1
> [1836102.963481] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3806 08/20/2012
> [1836102.993665]  00000000000000f5 ffff88021f287bd8 ffffffff815e1e2e 0000000000000006
> [1836103.018828]  ffff88021f287c28 ffff88021f287c18 ffffffff8104cf14 0000000000000000
> [1836103.043960]  ffffffff810c244f ffff8802148e3c00 0000000000000000 ffff88021f287d40
> [1836103.069099] Call Trace:
> [1836103.079178]  <NMI>  [<ffffffff815e1e2e>] dump_stack+0x4f/0x84
> [1836103.099185]  [<ffffffff8104cf14>] warn_slowpath_common+0x81/0x9b
> [1836103.119948]  [<ffffffff810c244f>] ? watchdog_overflow_callback+0x9c/0xa7
> [1836103.142762]  [<ffffffff8104cfd1>] warn_slowpath_fmt+0x46/0x48
> [1836103.162706]  [<ffffffff810c244f>] watchdog_overflow_callback+0x9c/0xa7
> [1836103.184957]  [<ffffffff810f10ce>] __perf_event_overflow+0x134/0x1bd
> [1836103.206470]  [<ffffffff810210dc>] ? x86_perf_event_set_period+0x103/0x10f
> [1836103.229493]  [<ffffffff810f16d6>] perf_event_overflow+0x14/0x16
> [1836103.249897]  [<ffffffff81027100>] intel_pmu_handle_irq+0x28c/0x311
> [1836103.271063]  [<ffffffff815e9534>] perf_event_nmi_handler+0x2b/0x48
> [1836103.292198]  [<ffffffff815e8ddf>] nmi_handle.isra.3+0x6f/0x16e
> [1836103.312281]  [<ffffffff815e958e>] ? perf_ibs_nmi_handler+0x3d/0x3d
> [1836103.333380]  [<ffffffff815e8f80>] do_nmi+0xa2/0x2ce
> [1836103.350571]  [<ffffffff815e85b1>] end_repeat_nmi+0x1e/0x2e
> [1836103.369600]  [<ffffffffa000fa97>] ? usb_hcd_flush_endpoint+0x38/0xf0 [usbcore]
> [1836103.393814]  [<ffffffffa000fa97>] ? usb_hcd_flush_endpoint+0x38/0xf0 [usbcore]
> [1836103.417981]  [<ffffffffa000fa97>] ? usb_hcd_flush_endpoint+0x38/0xf0 [usbcore]
> [1836103.442121]  <<EOE>>  [<ffffffffa0011de2>] usb_disable_endpoint+0x5c/0x74 [usbcore]

Something is stuck in usb_disable_endpoint.  But from the information 
shown, we have no way of knowing which driver is involved.
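
For reading a trace like the one quoted above, here is a minimal Python
sketch that separates the frames the kernel printed with a "?" (addresses
found on the stack that the unwinder could not confirm as part of the call
chain) from the ones printed without it. It assumes the line layout seen in
this report: a bracketed address, an optional "?", symbol+offset/size, and an
optional [module]. The name parse_frames is made up for illustration.

```python
import re

# Matches call-trace lines shaped like the ones in this report, e.g.
#   [<ffffffffa0011de2>] usb_disable_endpoint+0x5c/0x74 [usbcore]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"                      # "?" marks an unconfirmed frame
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"                 # present only for module code
)

def parse_frames(text):
    """Return one dict per stack frame found in the given console text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "symbol": m.group("symbol"),
                "module": m.group("module"),
                "reliable": m.group("unreliable") is None,
            })
    return frames

# Three lines copied from the quoted trace.
SAMPLE = """\
> [1836103.350571]  [<ffffffff815e85b1>] end_repeat_nmi+0x1e/0x2e
> [1836103.369600]  [<ffffffffa000fa97>] ? usb_hcd_flush_endpoint+0x38/0xf0 [usbcore]
> [1836103.442121]  <<EOE>>  [<ffffffffa0011de2>] usb_disable_endpoint+0x5c/0x74 [usbcore]
"""

if __name__ == "__main__":
    for f in parse_frames(SAMPLE):
        mark = " " if f["reliable"] else "?"
        print(mark, f["symbol"], f["module"] or "(core kernel)")
```

Every frame after the NMI section here belongs to usbcore, which is why the
trace by itself does not identify a particular device driver.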

Alan Stern
