Re: xHCI bug

On 10.11.2014 17:24, Felipe Balbi wrote:
> Hi,
> 
> On Fri, Nov 07, 2014 at 03:40:01PM +0200, Mathias Nyman wrote:
>> On 07.11.2014 00:25, Felipe Balbi wrote:
>>> On Thu, Nov 06, 2014 at 10:36:30AM -0600, Felipe Balbi wrote:
>>>> On Thu, Nov 06, 2014 at 06:31:20PM +0200, Mathias Nyman wrote:
>>>>> On 05.11.2014 21:28, Felipe Balbi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Tue, Oct 14, 2014 at 04:34:00PM +0300, Mathias Nyman wrote:
>>>>>>>>>> Could you try with xhci debugging enabled? (will probably produce a
>>>>>>>>>> lot of output)
>>>>>>>>>>
>>>>>>>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>>>>>>
>>>>>>>>> I'll try, sure.
>>>>>>>>
>>>>>>>> I used tracing otherwise the problem wouldn't show up. Attached you can
>>>>>>>> find output:
>>>>>>>>
>>>>>>>> 0b7e070de7b65de9f70805f4639b3e58  xhci-timeout-testusb.txt.gz
>>>>>>>>
>>>>>>>
>>>>>>> Thanks, looks like we end up calling cleanup_halted_endpoint()  a lot.
>>>>>>> This will (try to) reset the endpoint and move to handle the next TD (URB).
>>>>>>>
>>>>>>> This is called when we're processing control transfers and something out of the ordinary happens (returned STALL, BABBLE, and some other reasons)
>>>>>>>
>>>>>>> I need to dig a bit deeper to know what actually is going on. 
>>>>>>
>>>>>> any news here ? It's been almost a month.
>>>>>>
>>>>>
>>>>> While looking at this and other bugs I found races between reset endpoint, reset device, and set dequeue pointer commands. 
>>>>> I suspect the loop in your logs is due to starting the endpoint ring too early after reset. It restarts before we move
>>>>> past the problematic TD, and starts executing it again.
>>>>>
>>>>> The logs don't show why the TD fails in the first place, but I got another patch fixing other race issues which might help.
>>>>>
>>>>> Both patches are now in a "reset-rework" topic branch at:
>>>>>
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git reset-rework
>>>>>
>>>>> It's based on 3.18-rc2.
>>>>> I still haven't got or set up a USB device with gadget zero to test it out myself.
>>>>
>>>> I'll try to run it today or tomorrow.
>>>
>>> seems to be working so far. It has been running for at least a couple
>>> hours. I'll leave it running until Monday or Tuesday before giving you a
>>> Tested-by, though.
>>>
>>
>> Thanks, much appreciated. 
>> Sounds promising so far, hope it lasts over the weekend
> 
> Alright, it has been running for almost 4 days and no failures so far:
> 
> [1]+ ./test.sh &
> # uptime 
>  15:20:15 up 3 days, 20:08,  1 user,  load average: 1.63, 1.84, 1.86
> 
> So, for both commits on reset-rework (see below), you can have my:
> 
> Tested-by: Felipe Balbi <balbi@xxxxxx>
> 

Thanks a lot, this is good news.

-Mathias
