On 10.11.2014 17:24, Felipe Balbi wrote:
> Hi,
>
> On Fri, Nov 07, 2014 at 03:40:01PM +0200, Mathias Nyman wrote:
>> On 07.11.2014 00:25, Felipe Balbi wrote:
>>> On Thu, Nov 06, 2014 at 10:36:30AM -0600, Felipe Balbi wrote:
>>>> On Thu, Nov 06, 2014 at 06:31:20PM +0200, Mathias Nyman wrote:
>>>>> On 05.11.2014 21:28, Felipe Balbi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Tue, Oct 14, 2014 at 04:34:00PM +0300, Mathias Nyman wrote:
>>>>>>>>>> Could you try with xhci debugging enabled? (will probably produce a
>>>>>>>>>> lot of output)
>>>>>>>>>>
>>>>>>>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>>>>>>
>>>>>>>>> I'll try, sure.
>>>>>>>>
>>>>>>>> I used tracing, otherwise the problem wouldn't show up. Attached you can
>>>>>>>> find output:
>>>>>>>>
>>>>>>>> 0b7e070de7b65de9f70805f4639b3e58 xhci-timeout-testusb.txt.gz
>>>>>>>>
>>>>>>>
>>>>>>> Thanks, looks like we end up calling cleanup_halted_endpoint() a lot.
>>>>>>> This will (try to) reset the endpoint and move to handle the next TD (URB).
>>>>>>>
>>>>>>> This is called when we're processing control transfers and something out of the ordinary happens (returned STALL, BABBLE, and some other reasons)
>>>>>>>
>>>>>>> I need to dig a bit deeper to know what actually is going on.
>>>>>>
>>>>>> Any news here? It's been almost a month.
>>>>>>
>>>>>
>>>>> While looking at this and other bugs I found races between reset endpoint, reset device, and set dequeue pointer commands.
>>>>> I suspect the loop in your logs is due to starting the endpoint ring too early after reset. It restarts before we move
>>>>> past the problematic TD, and starts executing it again.
>>>>>
>>>>> The logs don't show why the TD fails in the first place, but I have another patch fixing other race issues which might help.
>>>>>
>>>>> Both patches are now in a "reset-rework" topic branch at:
>>>>>
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git reset-rework
>>>>>
>>>>> It's based on 3.18-rc2.
>>>>> I still haven't got or set up a USB device with gadget zero to test it out myself
>>>>
>>>> I'll try to run it today or tomorrow.
>>>
>>> Seems to be working so far. It has been running for at least a couple
>>> hours. I'll leave it running until Monday or Tuesday before giving you a
>>> Tested-by, though.
>>>
>>
>> Thanks, much appreciated.
>> Sounds promising so far, hope it lasts over the weekend
>
> Alright, it has been running for almost 4 days and no failures so far:
>
> [1]+ ./test.sh &
> # uptime
> 15:20:15 up 3 days, 20:08, 1 user, load average: 1.63, 1.84, 1.86
>
> So, for both commits on reset-rework (see below), you can have my:
>
> Tested-by: Felipe Balbi <balbi@xxxxxx>
>

Thanks a lot, this is good news.

-Mathias