On Tue, 9 Jul 2013, Devin Heitmueller wrote:

> So I hooked up the video and wrote a bit of Perl to parse the ISOC
> stream and render the underlying video frames. I can see definitively
> that the video returned from the device contains the corruption. This
> rules out any sort of DMA or memory related issue (proving that the
> data is not being mangled by the host on receipt).
>
> Now that I have the raw USB trace though including timing data, I
> started looking at the actual underlying ISOC traffic at the time of
> the corruption, and found something interesting: Despite having five
> URBs queued at all times with an interval of 1, there are cases where
> the URB isn't being sent. The corruption consistently follows one of
> these intervals where a URB was skipped. We're expecting the host
> controller to request to pull the buffer every 125us, and in instances
> where the corruption is exhibited immediately follow a 250us gap
> between URBs.
>
> See attached screenshot:
>
> http://devinheitmueller.com/isoc_loss.png
>
> Packet 27082 is the packet that contains the corruption. The previous
> URB was received exactly 250us prior (whereas it should have been only
> 125us). 349.594 - 349.344 = 250.
>
> I suspect the FIFO is overflowing on the chip as a result of the host
> controller not asking for the buffer when it's supposed to. It's
> worth mentioning that the "corrupt bytes" are actually also found
> several packets later in the correct place, suggesting the chip is
> probably employing some sort of circular buffer which is wrapping
> around.
>
> So should I be digging into the EHCI URB scheduling code? Any
> suggestions on where else I should be poking around would be very
> welcome.

Digging into the scheduling code probably won't help much. However
you could try collecting a usbmon trace (see
Documentation/usb/usbmon.txt). This would clearly show the timing of
URB submissions and completions.
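Once timing data is in hand, spotting the skipped intervals can be automated. The following is only a rough sketch in Python, not anything taken from usbmon or the em28xx driver: find_gaps(), its tolerance parameter, and the sample list are hypothetical, and it assumes the timestamps have already been extracted from the trace as integer microseconds. Only the two values 349344 and 349594 correspond to the figures quoted above (taking 349.344 and 349.594 to be milliseconds); the others are made up to fill out the example.

```python
# Hypothetical helper: flag gaps between consecutive isochronous
# transfers that exceed the expected service interval.

EXPECTED_US = 125  # one high-speed microframe (interval of 1)


def find_gaps(timestamps_us, expected_us=EXPECTED_US, tolerance_us=10):
    """Return (index, gap_us) pairs where the gap from the previous
    timestamp is larger than expected_us + tolerance_us."""
    gaps = []
    for i in range(1, len(timestamps_us)):
        delta = timestamps_us[i] - timestamps_us[i - 1]
        if delta > expected_us + tolerance_us:
            gaps.append((i, delta))
    return gaps


# Timestamps in microseconds. Only 349344 and 349594 come from the
# quoted trace; the remaining values are invented for illustration.
sample = [349094, 349219, 349344, 349594, 349719]
print(find_gaps(sample))  # [(3, 250)]
```

The tolerance is there because real timestamps will jitter by a few microseconds; the value of 10 is an arbitrary guess and would need adjusting to the capture hardware.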
> I'll be the first to admit that this isn't my particular area of
> expertise - so if I've made some stupid assumption about the expected
> behavior for the URB timing on the bus, don't hesitate to point that
> out.

The transfers should occur regularly at 1-microframe intervals. If
they don't then something is wrong somewhere. But I think the problem
is more likely to lie in the upper-level driver than in ehci-hcd.
You're using the em28xx driver?

At first glance, there is one obvious bug in that driver (probably not
at all related to your problem, though). The em28xx_irq_callback()
routine should not set urb->status.

I bet the problem is related to the usage of the URB_ISO_ASAP flag.
em28xx_alloc_urbs() sets URB_ISO_ASAP in urb->transfer_flags, and the
value never gets cleared. In fact, that flag bit is supposed to be set
only in the first URB of a stream, not in the following URBs. (The
same mistake is present for the URBs in the audio stream.)

Alan Stern