Re: xhci driver fails and corrupt memory

Sorry, I forgot to CC the mailing lists.
Also, I tried reproducing the failure on your for-usb-linus branch with
the NEC controller, and it works fine now (with the described change to
libusb). Fresco Logic, on the other hand, still fails every couple of
attempts. I will check Ivy Bridge too.

On Tue, 26 Jun 2012 11:47:57 +0400
Igor Kuzmin <parafin@xxxxxxxxx> wrote:

> On Mon, 25 Jun 2012 13:26:45 -0700
> Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> wrote:
> 
> > On Fri, Jun 22, 2012 at 03:36:53PM +0400, Igor Kuzmin wrote:
> > > Hello!
> > > 
> > > I'm a software engineer at XIMEA (www.ximea.com) responsible for Linux
> > > support for our products. We're using libusb-1.0 to handle our USB
> > > cameras. Everything works very stably with USB2 controllers, but not
> > > so well with xHCI (both with USB2 and USB3 devices).
> > 
> > Does your libusb driver submit very large transfers?  We ran into an
> > issue with libusb making assumptions about the short packet behavior
> > that just isn't true under xHCI.
> > 
> > libusb breaks large transfers into smaller buffers because older kernels
> > had a limit on how large each buffer could be.  Newer kernels now have a
> > limit on the total size of all mapped usbfs buffers.  So libusb would
> > break larger transfers into a couple of usbfs URB submissions, and usbfs
> > would assume that a short packet would stop the queue long enough for it
> > to cancel any URBs that were part of that transfer.  But that's not true
> > under xHCI, so your application would see a short transfer completion
> > followed by a canceled URB completion.
> > 
> > Basically, libusb just needs to be fixed to not break transfers into
> > smaller chunks.  Hans de Goede volunteered to work on this.
> 
> Yes, I stumbled into this issue. For the time being we patched libusb
> so that MAX_BULK_BUFFER_LENGTH is 1MB (that's exactly the size of the
> transfers we submit, so there is no splitting). Without this change it
> doesn't work at all (there are some short transfers). Good to hear that
> somebody is working on it. I submitted my patch to libusb's trac, but
> have had no response yet.
> 
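
The splitting behavior discussed above can be illustrated with a small
sketch. This is not libusb's code: split_transfer and urbs_to_cancel are
hypothetical helpers, and the 16384-byte chunk size is an assumption about
the unpatched MAX_BULK_BUFFER_LENGTH. It only shows the arithmetic: how
many usbfs URBs one transfer becomes, and how many are still queued (and
would have to be cancelled) when one chunk completes short.

```python
def split_transfer(total_len, max_chunk):
    """Return the lengths of the chunks a transfer of total_len bytes
    is split into when no chunk may exceed max_chunk bytes."""
    if total_len <= 0 or max_chunk <= 0:
        raise ValueError("lengths must be positive")
    full, rest = divmod(total_len, max_chunk)
    # 'full' chunks of the maximum size, plus one trailing partial chunk.
    return [max_chunk] * full + ([rest] if rest else [])


def urbs_to_cancel(chunks, short_index):
    """Number of chunks still queued behind the chunk at short_index
    when that chunk completes with a short packet."""
    if not 0 <= short_index < len(chunks):
        raise IndexError("short_index out of range")
    return len(chunks) - short_index - 1


ONE_MB = 1 << 20
ASSUMED_DEFAULT_CHUNK = 16384  # assumption: unpatched chunk size

split = split_transfer(ONE_MB, ASSUMED_DEFAULT_CHUNK)
unsplit = split_transfer(ONE_MB, ONE_MB)  # chunk size raised to 1MB

print(len(split), "URBs when split;", len(unsplit), "URB when not split")
print(urbs_to_cancel(split, 9), "URBs left to cancel if the 10th is short")
```

With the chunk size raised to the transfer size, the transfer is a single
URB, so a short packet leaves nothing behind it to cancel.
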
> > 
> > Also, you may need to increase the usbfs total buffer size by reloading
> > the usbcore:
> > 
> > sudo modprobe usbcore usbfs_memory_mb=1000
> > 
> > That will increase the usbfs total buffer limit to 1GB.
> 
> Is this connected to the "No room on EP Ring" messages? I had to limit
> the maximum number of submitted transfers to 3 (so 3 MB in total) to get
> rid of this message. Bigger values do work on kernels >=3.4, though;
> some dynamic expansion was added there, if I understand correctly.
> 
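
As a rough illustration of the usbfs buffer limit mentioned above, the
sketch below reads the current usbfs_memory_mb value from sysfs and checks
whether a given number of in-flight transfers fits under it. The helper
names are hypothetical, the 16 MB figure used in the example is an assumed
default, and the check ignores the kernel's per-URB bookkeeping overhead,
so treat it as an estimate only. It says nothing about the xHCI endpoint
ring, which as far as I know is a separate resource from usbfs memory.

```python
from pathlib import Path

# sysfs location of the usbcore module parameter
USBFS_PARAM = Path("/sys/module/usbcore/parameters/usbfs_memory_mb")


def read_usbfs_limit_mb(path=USBFS_PARAM):
    """Return the usbfs memory limit in MB, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def fits_usbfs_budget(num_transfers, transfer_bytes, limit_mb):
    """Estimate whether num_transfers buffers of transfer_bytes each
    fit under a usbfs limit of limit_mb megabytes (limit_mb > 0)."""
    if limit_mb <= 0:
        raise ValueError("limit_mb must be positive")
    return num_transfers * transfer_bytes <= limit_mb * 1024 * 1024


ONE_MB = 1 << 20
limit = read_usbfs_limit_mb()
if limit is None or limit <= 0:
    limit = 16  # assumption: fall back to the assumed default

print("usbfs limit (MB):", limit)
print("3 x 1MB transfers fit:", fits_usbfs_budget(3, ONE_MB, limit))
```
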
> > > xhci driver sometimes
> > > corrupts memory (probably DMA is to blame), some transfers fail with
> > > messages like these in kernel log:
> > > 
> > > ERROR Transfer event TRB DMA ptr not part of current TD
> > > WARN Successful completion on short TX
> > 
> > Were you running under a Fresco Logic host controller when you got those
> > messages?  We recently added a quirk for Fresco Logic for successful
> > completions on short packets.
> 
> I think the quirk was enabled, because I even tried adding different
> quirks myself, and it made little to no difference.
> 
> > 
> > > WARN: TRB error on endpoint
> > > WARN waiting for error on ep to be cleared
> > > 
> > > I've tried several controllers (Ivy Bridge, Fresco Logic, NEC) and
> > > different kernels (3.3.3, 3.4.3, 3.5-rc3, xhci git tree), problems
> > > remain. Last time I tried to enable XHCI verbose debug in the kernel, it
> > > hung the machine after the transfer started. The syslog daemon was
> > > still able to store at least some of it, though. So my question is how
> > > best to proceed with bug reports? What information should I provide,
> > > and running which kernel (and maybe which controller)?
> > 
> > Let's try with the NEC host controller, since that's a bit more stable
> > than the Fresco Logic host.
> > 
> > Can you try running with my for-usb-linus branch?  I just fixed a bug
> > related to successive stalls corrupting the software ring state.
> > 
> > Send me the dmesg with xHCI debugging turned off, and then try to use
> > netconsole to capture the hang with debugging turned on.
> > 
> > Sarah Sharp
> 
> OK, will do. Maybe usbmon output would be helpful too? Or, even better, we
> have a USB3 analyzer here, though its software seems to be Windows-only, so
> I'm not sure about the dump format.