On Tue, 2013-07-30 at 11:42 -0400, Alan Stern wrote: > On Mon, 29 Jul 2013, James Stone wrote: Just an FYI, it is usually better to email my rostedt@xxxxxxxxxxx account. I don't always read my redhat email. That's reserved for bugzillas assigned to me ;-) > > > OK, having said that, I just got it to happen - listening to audacity > > and just logging into Facebook (of all things!! Meh!) > > > > This is the contents of the trace file (as per instructions on bug #1191603) > > > > # tracer: irqsoff > > # > > # irqsoff latency trace v1.1.5 on 3.10.0-ver5 > > # -------------------------------------------------------------------- > > # latency: 2173 us, #4/4, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) > > # ----------------- > > # | task: apt-check-3628 (uid:1000 nice:19 policy:0 rt_prio:0) > > # ----------------- > > # => started at: perf_event_update_userpage > > # => ended at: retint_careful > > # > > # > > # _------=> CPU# > > # / _-----=> irqs-off > > # | / _----=> need-resched > > # || / _---=> hardirq/softirq > > # ||| / _--=> preempt-depth > > # |||| / delay > > # cmd pid ||||| time | caller > > # \ / ||||| \ | / > > apt-chec-3628 0d.h. 0us!: local_clock <-perf_event_update_userpage perf_event_update_userpage should not be taking 2ms to complete. > > apt-chec-3628 0dN.. 2173us : trace_hardirqs_on_thunk <-retint_careful > > apt-chec-3628 0dN.. 2173us+: trace_hardirqs_on_caller <-retint_careful > > apt-chec-3628 0dN.. 2177us : <stack trace> > > => trace_hardirqs_on_thunk > > > > I will send the tail of the usbmon trace off-list. > > This irqsoff trace and the usbmon trace were taken at the same time, > showing the same events, right? > > Here's the important part of the usbmon trace, right where the errors > began: > > ffff8800b5660200 1564937888 C Zo:1:002:1 0:1:3850:0 8 0:0:96 0:96:96 0:192:80 0:272:96 0:368:80 720 > > ffff8800b5660200 1564937895 S Zo:1:002:1 -115:1:3850 8 -18:0:80 -18:80:96 -18:176:80 -18:256:96 -18:352:80 704 = 00edba02 00371d06 00edba02 00371d06 005079fd 001b1003 005079fd 001b1003 > ffff8800c9d10c00 1564945487 C Zi:1:002:1 0:8:3863:0 1 0:0:4 4 = 08830500 > ffff8800c9d10c00 1564945495 S Zi:1:002:1 -115:8:3863 1 -18:0:4 4 < > ffff8800c9d10c00 1564945499 E Zi:1:002:1 -18 0 > > (The timestamp is the second column, in microseconds. The last line is > the first error, caused by the URB submission on the previous line, > which was too late.) > > This shows that the errors occurred because a period longer than 7 ms > passed with no URB completions (the interval between the first and > third lines above). Everywhere else in the usbmon trace there were at > least two URBs completing every ms, and there can be no doubt that the > EHCI controller tried to issue at least one IRQ every ms. > > This suggests that the IRQ line was disabled or interrupts were > disabled on all four CPUs for 7 ms. However, the irqsoff trace shows > that the maximum latency was 2 ms. Something strange is going on, but > I don't know what or how to find it. > > Steve, do you have any ideas about this? Or suggestions for other > people to ask? Is there SMIs on this box? Those will not be detected by the latency tracers directly. But we do have other tools to detect them. -- Steve > > I can try to ameliorate the situation. Although the 7-ms delay will > inevitably cause an underrun, it doesn't have to cause errors the way > it does now. I'll write a patch to handle this. It may take a few > days. > > Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html