On Thu, Nov 06, 2014 at 06:31:20PM +0200, Mathias Nyman wrote:
> On 05.11.2014 21:28, Felipe Balbi wrote:
> > Hi,
> >
> > On Tue, Oct 14, 2014 at 04:34:00PM +0300, Mathias Nyman wrote:
> >>>>> Could you try with xhci debugging enabled? (will probably produce a
> >>>>> lot of output)
> >>>>>
> >>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
> >>>>
> >>>> I'll try, sure.
> >>>
> >>> I used tracing otherwise the problem wouldn't show up. Attached you can
> >>> find output:
> >>>
> >>> 0b7e070de7b65de9f70805f4639b3e58 xhci-timeout-testusb.txt.gz
> >>>
> >>
> >> Thanks, looks like we end up calling cleanup_halted_endpoint() a lot.
> >> This will (try to) reset the endpoint and move to handle the next TD (URB).
> >>
> >> This is called when we're processing control transfers and something out of the ordinary happens (returned STALL, BABBLE, and some other reasons)
> >>
> >> I need to dig a bit deeper to know what actually is going on.
> >
> > any news here ? It's been almost a month.
> >
>
> While looking at this and other bugs I found races between reset endpoint, reset device, and set dequeue pointer commands.
> I suspect the loop in your logs is due to starting the endpoint ring too early after reset. It restarts before we move
> past the problematic TD, and start executing it again.
>
> The logs don't show why the TD fails in the first place, but I got another patch fixing other race issues which might help.
>
> Both patches are now in a "reset-rework" topic branch at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git reset-rework
>
> It's based on 3.18-rc2.
> I haven't still got or set up a usb device with gadget zero to test it out myself

I'll try to run it today or tomorrow.

-- 
balbi