On Sun, Jul 7, 2013 at 9:39 PM, Devin Heitmueller <dheitmueller@xxxxxxxxxxxxxx> wrote:
> I finally dug out my Beagle 480 USB, so I will get that hooked up this
> week, write a decoder to reassemble the video frames based on the USB
> trace, and know once and for all whether the device is delivering
> correct video or not. If the video being delivered by the device has
> no corruption, then we're talking about some sort of memory
> consistency or DMA issue (or perhaps some sort of problem with the USB
> core populating the finished URBs before calling the completion
> handler). If the video coming down the bus is corrupted, then we're
> probably talking about some sort of timing problem with the URB
> submission (combined with the FIFO on the chip poorly handling the
> incorrect timing).

So I hooked up the video and wrote a bit of Perl to parse the ISOC
stream and render the underlying video frames. I can see definitively
that the video returned from the device already contains the
corruption. This rules out any sort of DMA or memory related issue
(it proves that the data is not being mangled by the host on receipt).

Now that I have the raw USB trace including timing data, I started
looking at the actual underlying ISOC traffic at the time of the
corruption, and found something interesting:

Despite having five URBs queued at all times with an interval of 1,
there are cases where a URB isn't being sent. The corruption
consistently follows one of these intervals where a URB was skipped.
We're expecting the host controller to ask for the buffer every 125us,
but the instances where the corruption is exhibited immediately follow
a 250us gap between URBs.

See attached screenshot:

http://devinheitmueller.com/isoc_loss.png

Packet 27082 is the packet that contains the corruption. The previous
URB was received exactly 250us prior, whereas it should have been only
125us (349.594 - 349.344 = 0.250 ms, i.e. 250us).
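The gap check described above can be sketched in a few lines. This is only an illustration in Python, not the Perl script used for the analysis. It assumes the trace has already been reduced to (packet number, timestamp in ms) pairs. The function name find_gaps, the 10us jitter tolerance, and every sample record other than packet 27082 and the two timestamps quoted above are hypothetical.

```python
EXPECTED_INTERVAL_US = 125  # one high-speed microframe, i.e. interval of 1
TOLERANCE_US = 10           # assumed slack for capture jitter (hypothetical)


def find_gaps(packets, expected_us=EXPECTED_INTERVAL_US, tolerance_us=TOLERANCE_US):
    """Return (packet_id, gap_us, skipped) for each packet that follows an over-long gap.

    packets is a list of (packet_id, timestamp_ms) tuples in capture order.
    """
    gaps = []
    for (_, prev_ms), (pkt_id, cur_ms) in zip(packets, packets[1:]):
        # Work in integer microseconds to avoid floating-point noise.
        gap_us = round(cur_ms * 1000) - round(prev_ms * 1000)
        if gap_us > expected_us + tolerance_us:
            skipped = round(gap_us / expected_us) - 1
            gaps.append((pkt_id, gap_us, skipped))
    return gaps


# Sample data: only packet 27082 and the timestamps 349.344 / 349.594 come
# from the trace discussed above; the other records are made up to give
# the check some normally spaced neighbours.
trace = [
    (27079, 349.094),
    (27080, 349.219),
    (27081, 349.344),
    (27082, 349.594),
    (27083, 349.719),
]

for pkt_id, gap_us, skipped in find_gaps(trace):
    print(f"packet {pkt_id}: {gap_us}us since previous transfer, "
          f"{skipped} interval(s) skipped")
```

Running a check like this over the whole capture would show whether every corrupted frame lines up with a skipped interval, or only some of them.
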
I suspect the FIFO is overflowing on the chip as a result of the host
controller not asking for the buffer when it's supposed to. It's
worth mentioning that the "corrupt bytes" are actually also found
several packets later in the correct place, suggesting the chip is
probably employing some sort of circular buffer which is wrapping
around.

So should I be digging into the EHCI URB scheduling code? Any
suggestions on where else I should be poking around would be very
welcome.

I'll be the first to admit that this isn't my particular area of
expertise - so if I've made some stupid assumption about the expected
behavior for the URB timing on the bus, don't hesitate to point that
out.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com