Re: xhci driver fails and corrupt memory

OK, so with the for-usb-linus kernel, the current results with different
controllers are:
NEC - works
Fresco Logic - fails with "ERROR Transfer event TRB DMA ptr not part of
current TD"
Ivy Bridge - works, but there's another issue:
usb 4-2: new SuperSpeed USB device number 3 using xhci_hcd
usb 4-2: skipped 1 descriptor after endpoint
usb 4-2: skipped 1 descriptor after endpoint
usb 4-2: skipped 1 descriptor after endpoint
usb 4-2: skipped 1 descriptor after endpoint
usb 4-2: default language 0x0409
usb 4-2: udev 3, busnum 4, minor = 386
usb 4-2: New USB device found, idVendor=20f7, idProduct=3001
usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 4-2: Product: XIMEA
usb 4-2: Manufacturer: XIMEA
usb 4-2: usb_probe_device
usb 4-2: configuration #1 chosen from 1 choice
usb 4-2: Successful Endpoint Configure command
usb 4-2: Successful evaluate context command
usb 4-2: Set SEL for device-initiated U1 failed.
usb 4-2: Successful evaluate context command
usb 4-2: adding 4-2:1.0 (config #1, interface 0)
hub 4-0:1.0: state 7 ports 4 chg 0000 evt 0004
hub 4-0:1.0: warm reset change on port 2
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: xiSample timed out on ep0out len=0/0
usb 4-2: Enable of device-initiated U1 failed.
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: Successful evaluate context command
usb 4-2: xiSample timed out on ep0out len=0/0
usb 4-2: Enable of device-initiated U1 failed.
usb 4-2: Successful evaluate context command
xhci_hcd 0000:00:14.0: WARN Event TRB for slot 2 ep 2 with no TDs
queued?

The problem manifests as hangs of a second or two during certain
operations. I haven't seen it before with other kernels.
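
To see at a glance which messages dominate a log like the one above, the
lines can be tallied per device. This is only a sketch: it assumes every
relevant line has the form "usb <port>: <message>", and the helper name
count_usb_messages is made up for illustration.

```python
from collections import Counter


def count_usb_messages(log_text, device="usb 4-2"):
    """Count how often each message appears for one device prefix."""
    prefix = device + ": "
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Keep only the message part after "usb 4-2: ".
            counts[line[len(prefix):]] += 1
    return counts


# A short excerpt of the log above, used as sample input.
sample = """\
usb 4-2: Successful evaluate context command
usb 4-2: xiSample timed out on ep0out len=0/0
usb 4-2: Enable of device-initiated U1 failed.
usb 4-2: Successful evaluate context command
usb 4-2: xiSample timed out on ep0out len=0/0
usb 4-2: Enable of device-initiated U1 failed.
"""

for message, n in count_usb_messages(sample).most_common():
    print(n, message)
```

Feeding it the full dmesg output instead of the excerpt should make the
repeating timeout / U1 failure pairs easy to spot.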

On Tue, 26 Jun 2012 12:51:51 +0400
Igor Kuzmin <parafin@xxxxxxxxx> wrote:

> Sorry, forgot to CC the mailing lists.
> Also, I tried reproducing the failure on your for-usb-linus branch with
> the NEC controller, but it works fine now (with the described change to
> libusb). Fresco Logic, on the other hand, still fails every couple of
> times. Will check Ivy Bridge too.
> 
> On Tue, 26 Jun 2012 11:47:57 +0400
> Igor Kuzmin <parafin@xxxxxxxxx> wrote:
> 
> > On Mon, 25 Jun 2012 13:26:45 -0700
> > Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx> wrote:
> > 
> > > On Fri, Jun 22, 2012 at 03:36:53PM +0400, Igor Kuzmin wrote:
> > > > Hello!
> > > > 
> > > > I'm a software engineer at XIMEA (www.ximea.com) responsible for Linux
> > > > support for our products. We're using libusb-1.0 to handle our USB
> > > > cameras. Everything is very stable with USB2 controllers, but not so
> > > > good with xHCI (both with USB2 and USB3 devices).
> > > 
> > > Does your libusb driver submit very large transfers?  We ran into an
> > > issue with libusb making assumptions about the short packet behavior
> > > that just isn't true under xHCI.
> > > 
> > > libusb breaks large transfers into smaller buffers because older kernels
> > > had a limit on how large each buffer could be.  Newer kernels now have a
> > > limit on the total size of all mapped usbfs buffers.  So libusb would
> > > break larger transfers into a couple of usbfs URB submissions, and usbfs
> > > would assume that a short packet would stop the queue long enough for it
> > > to cancel any URBs that were part of that transfer.  But that's not true
> > > under xHCI, so your application would see a short transfer completion
> > > followed by a canceled URB completion.
> > > 
> > > Basically, libusb just needs to be fixed to not break transfers into
> > > smaller chunks.  Hans de Goede volunteered to work on this.
> > 
> > Yes, I stumbled into this issue. For the time being we patched libusb
> > so that MAX_BULK_BUFFER_LENGTH is 1MB (that's exactly how big the
> > transfers we submit are, so no splitting). Without this change it just
> > doesn't work at all (there are some short transfers). Good to hear that
> > somebody is working on it. I submitted my patch to libusb's trac, but
> > have had no response yet.
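
To illustrate why raising MAX_BULK_BUFFER_LENGTH to the transfer size
avoids the splitting described above, here is a rough sketch of the chunking
arithmetic. It is not libusb's actual code: split_transfer is a hypothetical
helper, and the 16 KB default chunk size is an assumption about the
unpatched library rather than something stated in this thread.

```python
def split_transfer(total_len, max_chunk):
    """Return the lengths of the pieces a transfer would be cut into.

    Each piece stands for one usbfs URB submission. Zero-length
    transfers are ignored here to keep the sketch short.
    """
    if total_len < 0 or max_chunk <= 0:
        raise ValueError("total_len must be >= 0 and max_chunk > 0")
    lengths = []
    remaining = total_len
    while remaining > 0:
        piece = min(remaining, max_chunk)
        lengths.append(piece)
        remaining -= piece
    return lengths


ONE_MB = 1024 * 1024
ASSUMED_DEFAULT_CHUNK = 16 * 1024  # assumption, not taken from the thread

# Unpatched (assumed) chunk size: one 1 MB transfer becomes many URBs.
print(len(split_transfer(ONE_MB, ASSUMED_DEFAULT_CHUNK)))  # 64

# Patched so the chunk size equals the transfer size: a single URB.
print(len(split_transfer(ONE_MB, ONE_MB)))  # 1
```

With a single URB per transfer there are no sibling URBs left to cancel
after a short packet, which is the situation the patch is meant to avoid.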
> > 
> > > 
> > > Also, you may need to increase the usbfs total buffer size by reloading
> > > the usbcore:
> > > 
> > > sudo modprobe usbcore usbfs_memory_mb=1000
> > > 
> > > That will increase the usbfs total buffer limit to 1GB.
> > 
> > Is this connected to the "No room on EP Ring" messages? I had to limit
> > the maximum number of submitted transfers to 3 (so 3 MB in total) to get
> > rid of this message. Bigger values do work on kernels >=3.4, though; some
> > dynamic expansion was added there, if I understand correctly.
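
For the usbfs side of this, the arithmetic is simple enough to sketch.
Whether the "No room on EP Ring" message has anything to do with this limit
is not settled in this thread; the code below only estimates how many
transfers fit under usbfs_memory_mb. The sysfs path and both helper names
are assumptions for illustration, and per-URB bookkeeping overhead is
ignored.

```python
def read_usbfs_memory_mb(
        path="/sys/module/usbcore/parameters/usbfs_memory_mb"):
    """Return the current limit in MB, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def max_transfers_in_flight(limit_mb, transfer_bytes):
    """Rough upper bound on simultaneously submitted transfers."""
    if limit_mb <= 0 or transfer_bytes <= 0:
        raise ValueError("limit_mb and transfer_bytes must be positive")
    return (limit_mb * 1024 * 1024) // transfer_bytes


ONE_MB = 1024 * 1024

# With usbfs_memory_mb=1000 as suggested above and 1 MB transfers:
print(max_transfers_in_flight(1000, ONE_MB))  # 1000

limit = read_usbfs_memory_mb()
if limit is not None and limit > 0:
    print("current limit allows about",
          max_transfers_in_flight(limit, ONE_MB), "transfers of 1 MB")
```

So a cap of 3 submitted transfers is far below what a 1000 MB usbfs
budget would allow by this estimate.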
> > 
> > > > The xhci driver sometimes
> > > > corrupts memory (probably DMA is to blame), and some transfers fail
> > > > with messages like these in the kernel log:
> > > > 
> > > > ERROR Transfer event TRB DMA ptr not part of current TD
> > > > WARN Successful completion on short TX
> > > 
> > > Were you running under a Fresco Logic host controller when you got those
> > > messages?  We recently added a quirk for Fresco Logic for successful
> > > completions on short packets.
> > 
> > I think the quirk was enabled, because I even tried adding different
> > quirks myself, which made little to no difference.
> > 
> > > 
> > > > WARN: TRB error on endpoint
> > > > WARN waiting for error on ep to be cleared
> > > > 
> > > > I've tried several controllers (Ivy Bridge, Fresco Logic, NEC) and
> > > > different kernels (3.3.3, 3.4.3, 3.5-rc3, xhci git tree), and the
> > > > problems remain. Last time I tried to enable xHCI verbose debug in the
> > > > kernel, it hung the machine after the transfer started. The syslog
> > > > daemon was able to store at least some of it, though. So my question
> > > > is how best to proceed with bug reports? What information should I
> > > > provide, running which kernel (and controller maybe?..)?
> > > 
> > > Let's try with the NEC host controller, since that's a bit more stable
> > > than the Fresco Logic host.
> > > 
> > > Can you try running with my for-usb-linus branch?  I just fixed a bug
> > > related to successive stalls corrupting the software ring state.
> > > 
> > > Send me the dmesg with xHCI debugging turned off, and then try to use
> > > netconsole to capture the hang with debugging turned on.
> > > 
> > > Sarah Sharp
> > 
> > OK, will do. Maybe usbmon output will be helpful too? Or, even better, we
> > have a USB3 analyzer here, though its software seems to be Windows only,
> > so I'm not sure about the dump format.