Re: xhci driver fails and corrupt memory

On Fri, Jun 22, 2012 at 03:36:53PM +0400, Igor Kuzmin wrote:
> Hello!
> 
> I'm software engineer at XIMEA (www.ximea.com) responsible for Linux
> support for our products. We're using libusb-1.0 to handle our USB
> cameras, everything works very stable with USB2 controllers, not so
> good with XHCI (both with USB2 and USB3 devices).

Does your libusb driver submit very large transfers?  We ran into an
issue with libusb making assumptions about short packet behavior that
just aren't true under xHCI.

libusb breaks large transfers into smaller buffers because older kernels
had a limit on how large each buffer could be.  Newer kernels now have a
limit on the total size of all mapped usbfs buffers.  So libusb would
break larger transfers into a couple of usbfs URB submissions, and usbfs
would assume that a short packet would stop the queue long enough for it
to cancel any URBs that were part of that transfer.  But that's not true
under xHCI, so your application would see a short transfer completion
followed by a canceled URB completion.
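
To make the splitting concrete, here is a rough sketch in Python.  It is
not libusb's code: the function names are made up for illustration, and
the 16384-byte chunk size is only an assumed example value.  It shows
which URB a short packet lands in and which URBs are still queued behind
it, i.e. the ones that have to be canceled.

```python
def split_transfer(total_len, chunk_len):
    """Sizes of the URBs that one large transfer would be split into."""
    return [min(chunk_len, total_len - off)
            for off in range(0, total_len, chunk_len)]


def short_packet_effect(urb_sizes, received):
    """Find where a short packet lands in a split transfer.

    `received` is the number of bytes the device sent before ending the
    transfer early.  Returns (index of the URB that completes short,
    bytes in that URB, indices of the URBs still queued behind it).
    """
    if not urb_sizes or not 0 <= received < sum(urb_sizes):
        raise ValueError("received must be smaller than the full transfer")
    chunk_len = urb_sizes[0]  # every URB except possibly the last is this size
    short_index = received // chunk_len
    actual = received - short_index * chunk_len
    trailing = list(range(short_index + 1, len(urb_sizes)))
    return short_index, actual, trailing


if __name__ == "__main__":
    urbs = split_transfer(100000, 16384)   # chunk size is an assumption
    idx, actual, trailing = short_packet_effect(urbs, 20000)
    print("URB sizes:", urbs)
    print("URB", idx, "completes short with", actual, "bytes")
    print("URBs still queued and needing cancellation:", trailing)
```

The trailing URBs are the problem case: the split only works if the
queue pauses long enough for them to be canceled, and under xHCI it
does not.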

Basically, libusb just needs to be fixed to not break transfers into
smaller chunks.  Hans de Goede volunteered to work on this.

Also, you may need to increase the usbfs total buffer size by reloading
the usbcore module:

sudo modprobe usbcore usbfs_memory_mb=1000

That will increase the usbfs total buffer limit to 1GB.
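
To check what the limit is currently set to, something like the
following Python sketch should work.  It assumes the parameter is
exposed at the usual sysfs location for module parameters; the helper
name is made up, and it returns None if the file isn't there.

```python
import os

# Assumed sysfs location of the usbcore parameter.
USBFS_MEMORY_MB_PATH = "/sys/module/usbcore/parameters/usbfs_memory_mb"


def read_usbfs_memory_mb(path=USBFS_MEMORY_MB_PATH):
    """Return the usbfs total buffer limit in MB, or None if unavailable."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    limit = read_usbfs_memory_mb()
    if limit is None:
        print("usbfs_memory_mb is not exposed on this system")
    else:
        print("usbfs total buffer limit:", limit, "MB")
```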

> xhci driver sometimes
> corrupt memory (probably DMA is to blame), some transfers fail with
> messages like these in kernel log:
> 
> ERROR Transfer event TRB DMA ptr not part of current TD
> WARN Successful completion on short TX

Were you running under a Fresco Logic host controller when you got those
messages?  We recently added a quirk for Fresco Logic for successful
completions on short packets.

> WARN: TRB error on endpoint
> WARN waiting for error on ep to be cleared
> 
> I've tried several controllers (Ivy Bridge, Fresco Logic, NEC) and
> different kernels (3.3.3, 3.4.3, 3.5-rc3, xhci git tree), problems
> remain. Last time I tried to enable XHCI verbose debug in kernel it
> hanged the machine after the transfer started. Syslog daemon though was
> able to store at least some of it. So my question is how better to
> proceed with bug reports? What information to provide, running which
> kernel (and controller maybe?..)?

Let's try with the NEC host controller, since that's a bit more stable
than the Fresco Logic host.

Can you try running with my for-usb-linus branch?  I just fixed a bug
related to successive stalls corrupting the software ring state.

Send me the dmesg with xHCI debugging turned off, and then try to use
netconsole to capture the hang with debugging turned on.

Sarah Sharp