On 3.12.2020 0.59, Ross Zwisler wrote:
> On Mon, Nov 23, 2020 at 8:04 AM Mathias Nyman
> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>
>> I think I got most of the functionality now working.
>> The series is not in upstream shape, but should work, and can be tested.
>> just pushed it to a rewrite_halt_stop_handling branch in my tree, ten patches on top of 5.10-rc4
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git rewrite_halt_stop_handling
>> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=rewrite_halt_stop_handling
>>
>> It still contains dead code that needs to be removed, and all streams (uas) cases are not
>> handled properly, it won't pass checkpatch.. and so on, but it should be testable.
>>
>> Thanks
>> -Mathias
>
> Unfortunately I'm still able to reproduce the failure with your
> patches. Here is a dmesg:
>
> https://gist.github.com/rzwisler/05e52020e87f0ccd6185182be999dae0
>
> I was testing at this commit:
>
> 3c1f3ab219e5f xhci: handle halting transfer event properly after
> endpoint stop and halt raced.
>
> Turning on ftrace makes it much harder to reproduce. Should I keep
> trying for a repro with tracing turned on, or is the fact that it
> still happens informative enough to know we have to look elsewhere for
> a solution?
>
> Thanks,
> - Ross
>

Ok, thanks.
Then the root cause remains unknown.

For some reason the dequeue pointer field in the endpoint context contains
zero instead of the new dequeue pointer.
The (output) endpoint context is supposed to be written only by the controller.

Time to change strategy and start to detect and treat the symptoms instead.

I wrote a patch that detects the 0-dequeue pointer and issues a new
Set TR Deq pointer command. Hopefully that works.

Patch added to the same branch, can you try it out?

3f6326766abc xhci: retry setting new dequeue if xHC hardware failed to update it

I didn't set a retry limit yet, so if it doesn't work it might retry forever.

Thanks
Mathias
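
For illustration only, the detect-and-retry idea with a retry limit added could be sketched roughly like this. It is a standalone userspace model and not the code in 3f6326766abc. The names fake_hc, fake_ep_ctx, fake_set_tr_deq, set_deq_with_retry and DEQ_PTR_MASK are all made up for the sketch. The mask assumes the low four bits of the dequeue field are not part of the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: low 4 bits of the dequeue field hold cycle state/reserved bits. */
#define DEQ_PTR_MASK (~0xfULL)

/* Hypothetical stand-in for the (output) endpoint context. */
struct fake_ep_ctx {
	uint64_t deq;	/* dequeue pointer as written back by the controller */
};

/* Hypothetical controller model that fails to update the first drop_count times. */
struct fake_hc {
	struct fake_ep_ctx ep;
	unsigned int drop_count;	/* commands that will leave deq at zero */
	unsigned int cmds_issued;	/* total Set TR Deq commands issued */
};

/* Model of one Set TR Deq command, including the observed failure mode. */
static void fake_set_tr_deq(struct fake_hc *hc, uint64_t new_deq)
{
	hc->cmds_issued++;
	if (hc->drop_count > 0) {
		hc->drop_count--;
		hc->ep.deq = 0;		/* symptom: context shows a zero dequeue */
		return;
	}
	hc->ep.deq = new_deq;
}

/*
 * Issue the command, check the context for a zero dequeue pointer, and
 * reissue at most max_retries times. Returns false if the context still
 * shows zero after the last attempt, so the caller can give up instead
 * of retrying forever.
 */
static bool set_deq_with_retry(struct fake_hc *hc, uint64_t new_deq,
			       unsigned int max_retries)
{
	unsigned int attempt;

	assert(hc != NULL);
	assert((new_deq & DEQ_PTR_MASK) != 0);

	for (attempt = 0; attempt <= max_retries; attempt++) {
		fake_set_tr_deq(hc, new_deq);
		if ((hc->ep.deq & DEQ_PTR_MASK) != 0)
			return true;	/* controller updated the context */
	}
	return false;
}
```

With max_retries set to 3 the command is issued at most four times in total. What the driver should do once the limit is hit is left open here.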