On Fri, Jun 22, 2012 at 03:36:53PM +0400, Igor Kuzmin wrote:
> Hello!
>
> I'm software engineer at XIMEA (www.ximea.com) responsible for Linux
> support for our products. We're using libusb-1.0 to handle our USB
> cameras, everything works very stable with USB2 controllers, not so
> good with XHCI (both with USB2 and USB3 devices).

Does your libusb driver submit very large transfers? We ran into an
issue with libusb making assumptions about short packet behavior that
just aren't true under xHCI.

libusb breaks large transfers into smaller buffers because older kernels
had a limit on how large each buffer could be. Newer kernels now have a
limit on the total size of all mapped usbfs buffers. So libusb would
break larger transfers into a couple of usbfs URB submissions, and
usbfs would assume that a short packet would stop the queue long enough
for it to cancel any URBs that were part of that transfer. But that's
not true under xHCI, so your application would see a short transfer
completion followed by a canceled URB completion.

Basically, libusb just needs to be fixed to not break transfers into
smaller chunks. Hans de Goede volunteered to work on this.

Also, you may need to increase the usbfs total buffer size by reloading
usbcore:

sudo modprobe usbcore usbfs_memory_mb=1000

That will increase the usbfs total buffer limit to 1GB.

> xhci driver sometimes
> corrupt memory (probably DMA is to blame), some transfers fail with
> messages like these in kernel log:
>
> ERROR Transfer event TRB DMA ptr not part of current TD
> WARN Successful completion on short TX

Were you running under a Fresco Logic host controller when you got those
messages? We recently added a quirk for Fresco Logic for successful
completions on short packets.

> WARN: TRB error on endpoint
> WARN waiting for error on ep to be cleared
>
> I've tried several controllers (Ivy Bridge, Fresco Logic, NEC) and
> different kernels (3.3.3, 3.4.3, 3.5-rc3, xhci git tree), problems
> remain. Last time I tried to enable XHCI verbose debug in kernel it
> hanged the machine after the transfer started. Syslog daemon though was
> able to store at least some of it. So my question is how better to
> proceed with bug reports? What information to provide, running which
> kernel (and controller maybe?..)?

Let's try with the NEC host controller, since that's a bit more stable
than the Fresco Logic host. Can you try running with my for-usb-linus
branch? I just fixed a bug related to successive stalls corrupting the
software ring state.

Send me the dmesg with xHCI debugging turned off, and then try to use
netconsole to capture the hang with debugging turned on.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
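
A minimal sketch, in Python, of checking the usbfs total buffer limit
discussed in this message before queuing large transfers. It assumes the
limit is exposed through sysfs at
/sys/module/usbcore/parameters/usbfs_memory_mb and that a value of 0
means "no limit". The helper names read_usbfs_limit_mb and transfer_fits
are hypothetical and are not part of libusb or the kernel.

```python
from pathlib import Path

# Assumed sysfs location of the usbcore module parameter.
USBFS_LIMIT_PATH = Path("/sys/module/usbcore/parameters/usbfs_memory_mb")


def read_usbfs_limit_mb(path=USBFS_LIMIT_PATH):
    """Return the usbfs buffer limit in MB, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def transfer_fits(total_bytes, limit_mb):
    """Return True if total_bytes of queued buffers fits under limit_mb.

    A limit of 0 is treated as "no limit" (assumption, see above).
    """
    if limit_mb == 0:
        return True
    return total_bytes <= limit_mb * 1024 * 1024


if __name__ == "__main__":
    limit = read_usbfs_limit_mb()
    if limit is None:
        print("usbfs_memory_mb not readable on this system")
    else:
        wanted = 64 * 1024 * 1024  # example: 64 MB of queued buffers
        print(f"limit={limit} MB, 64 MB fits: {transfer_fits(wanted, limit)}")
```

If the reported limit is too small for the total amount of data queued
at once, raising it with the modprobe command above is one option.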