On Mon, Jul 8, 2013 at 9:39 AM, Devin Heitmueller
<dheitmueller@xxxxxxxxxxxxxx> wrote:
> Hi Alan,
>
> Thanks for taking the time to provide feedback. I'm just noticing now
> that I left off the subject line, which all the more reason makes me
> thankful that you bothered to read an email with as uninteresting a
> subject line as is possible. :-)
>
> On Tue, Jul 2, 2013 at 11:21 AM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>> That's weird. But it might not be the scheduler so much; it could be
>> related more to the total CPU load.
>
> Sorry, I didn't mean to suggest the scheduler itself was at fault -
> just that the high context switching may be changing the timing in a
> way that exposes a missing spinlock in usb core or increases the
> likelihood that a call that is sleepable is taking the path of the
> context switch.
>
>>> The problem has been seen on both the stock EHCI driver (on x86) as
>>> well as the musb driver used on the TI Davinci platform (ARM). The
>>> transfer buffer itself is being allocated using usb_alloc_coherent(),
>>> and I've seen it when allocating with vmalloc() as well.
>>
>> Do you mean kmalloc()? Memory allocated with vmalloc() is generally
>> not suitable for DMA mapping.
>
> Yeah, typo. Sorry.
>
>>> This feels like some sort of DMA or cache related issue, since the
>>> behavior of the URB completion handler itself appears completely
>>> consistent regardless of the system load. I'm seeing the issue on
>>> 3.10-rc6 all the way back to 2.6.31 (the earliest I can go on my
>>> Ubuntu box given some udev related dependencies).
>>>
>>> I've done plenty of work on USB drivers under Linux over the years,
>>> but haven't dug too much into the USB core. Anybody who has any
>>> suggestions on how to debug such a timing problem, such suggestions
>>> would be very welcomed.
>>
>> This is an interesting problem, but I don't think you'll get much
>> insight from looking at the USB side of things. You could try asking
>> the people in charge of the DMA- and cache-related parts of the kernel.
>
> I finally dug out my Beagle 480 USB, so I will get that hooked up this
> week, write a decoder to reassemble the video frames based on the USB
> trace, and know once and for all whether the device is delivering
> correct video or not. If the video being delivered by the device has
> no corruption, then we're talking about some sort of memory
> consistency or DMA issue (or perhaps some sort of problem with the USB
> core populating the finished URBs before calling the completion
> handler). If the video coming down the bus is corrupted, then we're
> probably talking about some sort of timing problem with the URB
> submission (combined with the FIFO on the chip poorly handling the
> incorrect timing).

One suggestion: it may be better to first verify whether the problem
occurs on x86 EHCI, because x86 is a memory-coherent architecture,
which means there generally shouldn't be a DMA issue on x86.

Thanks,
--
Ming Lei
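
The check Devin describes (reassemble the frames from the bus trace, then see whether they match what the driver received) could be automated roughly as below. This is only a sketch in Python, resting on assumptions that are not in the thread: the trace is assumed to be already decoded into hypothetical (frame_id, payload) pairs, and the driver-side frames are assumed to be dumped into a dict keyed by the same frame ids. The real analyzer export format and the device's framing are not covered here.

```python
from collections import defaultdict


def reassemble(packets):
    """Concatenate payloads per frame id, in the order they were captured.

    `packets` is an iterable of (frame_id, payload) pairs. This layout is
    hypothetical; a real bus trace needs its own parser first.
    """
    frames = defaultdict(bytearray)
    for frame_id, payload in packets:
        frames[frame_id] += payload
    return {frame_id: bytes(buf) for frame_id, buf in frames.items()}


def mismatched_frames(bus_packets, host_frames):
    """Return sorted ids of frames whose host copy differs from the bus copy.

    Only frames present on both sides are compared; frames missing from
    either side are ignored by this sketch.
    """
    bus_frames = reassemble(bus_packets)
    common_ids = bus_frames.keys() & host_frames.keys()
    return [fid for fid in sorted(common_ids)
            if bus_frames[fid] != host_frames[fid]]


if __name__ == "__main__":
    # Made-up data: frame 1 was altered somewhere after leaving the bus.
    trace = [(0, b"ab"), (0, b"cd"), (1, b"ef")]
    driver_dump = {0: b"abcd", 1: b"eX"}
    print(mismatched_frames(trace, driver_dump))
```

An empty result would mean the host buffers match the wire, so any visible corruption was already on the bus, which points at the device or the URB submission timing. A non-empty result would point at the host side (DMA, cache, or how the finished URBs are populated).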