On Mon, 29 Jul 2013, James Stone wrote: > OK, having said that, I just got it to happen - listening to audacity > and just logging into Facebook (of all things!! Meh!) > > This is the contents of the trace file (as per instructions on bug #1191603) > > # tracer: irqsoff > # > # irqsoff latency trace v1.1.5 on 3.10.0-ver5 > # -------------------------------------------------------------------- > # latency: 2173 us, #4/4, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) > # ----------------- > # | task: apt-check-3628 (uid:1000 nice:19 policy:0 rt_prio:0) > # ----------------- > # => started at: perf_event_update_userpage > # => ended at: retint_careful > # > # > # _------=> CPU# > # / _-----=> irqs-off > # | / _----=> need-resched > # || / _---=> hardirq/softirq > # ||| / _--=> preempt-depth > # |||| / delay > # cmd pid ||||| time | caller > # \ / ||||| \ | / > apt-chec-3628 0d.h. 0us!: local_clock <-perf_event_update_userpage > apt-chec-3628 0dN.. 2173us : trace_hardirqs_on_thunk <-retint_careful > apt-chec-3628 0dN.. 2173us+: trace_hardirqs_on_caller <-retint_careful > apt-chec-3628 0dN.. 2177us : <stack trace> > => trace_hardirqs_on_thunk > > I will send the tail of the usbmon trace off-list. This irqsoff trace and the usbmon trace were taken at the same time, showing the same events, right? Here's the important part of the usbmon trace, right where the errors began: ffff8800b5660200 1564937888 C Zo:1:002:1 0:1:3850:0 8 0:0:96 0:96:96 0:192:80 0:272:96 0:368:80 720 > ffff8800b5660200 1564937895 S Zo:1:002:1 -115:1:3850 8 -18:0:80 -18:80:96 -18:176:80 -18:256:96 -18:352:80 704 = 00edba02 00371d06 00edba02 00371d06 005079fd 001b1003 005079fd 001b1003 ffff8800c9d10c00 1564945487 C Zi:1:002:1 0:8:3863:0 1 0:0:4 4 = 08830500 ffff8800c9d10c00 1564945495 S Zi:1:002:1 -115:8:3863 1 -18:0:4 4 < ffff8800c9d10c00 1564945499 E Zi:1:002:1 -18 0 (The timestamp is the second column, in microseconds. The last line is the first error, caused by the URB submission on the previous line, which was too late.) This shows that the errors occurred because a period longer than 7 ms passed with no URB completions (the interval between the first and third lines above). Everywhere else in the usbmon trace there were at least two URBs completing every ms, and there can be no doubt that the EHCI controller tried to issue at least one IRQ every ms. This suggests that the IRQ line was disabled or interrupts were disabled on all four CPUs for 7 ms. However, the irqsoff trace shows that the maximum latency was 2 ms. Something strange is going on, but I don't know what or how to find it. Steve, do you have any ideas about this? Or suggestions for other people to ask? I can try to ameliorate the situation. Although the 7-ms delay will inevitably cause an underrun, it doesn't have to cause errors the way it does now. I'll write a patch to handle this. It may take a few days. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html