On 6.1.2021 20.52, Ross Zwisler wrote: > On Wed, Dec 30, 2020 at 5:31 AM Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote: > <> >> I was able to reproduce the issue with two USB devices on a different system. >> >> I saw that the new incorrect dequeue pointer sometimes had the higher 32 bits >> incorrect while the lower bits were correct. >> So this looks like a memory access order issue. >> >> The command TRB is 16 bytes. The xhci driver writes it in four 4 byte chunks. >> Even if the driver writes the chunk that sets the cycle bit last, handing the TRB >> over to the controller, it appears that the actual write order can be different. >> The controller ends up reading a command TRB with updated cycle bit but old bogus >> values in the "new dequeue pointer" field. >> >> Adding a write memory barrier before writing the cycle bit solved the issue in my case. >> >> Whole patch series is updated, added write memory barrier, and rebased on 5.10. >> It can be found force-updated in the same rewrite_halt_stop_handling branch: >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git rewrite_halt_stop_handling >> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=rewrite_halt_stop_handling >> >> Does this work for you? > > Yes, it does!! I verified that I'm able to reproduce the issue in less than > 10 seconds with this commit which is the HEAD~1 of your series: > > a7d053d207121 xhci: handle halting transfer event properly after endpoint stop and halt raced. > > With this commit (the final commit in your series, adding a single patch): > > 96887d191a88c xhci: make sure TRB is fully written before giving it to the controller > > I ran cleanly for over 20 minutes and haven't been able to reproduce the > issue. It looks like the wmb() added in that patch makes all the difference! > > Thank you for the fix, and you can add this tag to the series: > > Tested-by: Ross Zwisler <zwisler@xxxxxxxxxx> > Great, thanks for testing. I think it makes sense to cherry pick that one-liner wmb() patch first to 5.11 and stable, then add rest of the rewrite later. -Mathias