On Mon, Oct 14, 2024 at 09:10:05PM +0200, Michal Pecio wrote:
> The NEC uPD720200 has a bug, which prevents reliably stopping
> an endpoint shortly after it has been restarted. This usually
> happens when a driver kills many URBs in quick succession and
> it results in concurrent execution and cancellation of TDs.
>
> This is handled by stopping the endpoint again if in doubt.
>
> This "doubt" turns out to be a problem, because Stop Endpoint
> may be queued when the EP is already Stopped (for Set TR Deq
> execution, for example) or becomes Stopped concurrently (by
> Reset Endpoint, for example). If the EP is truly Stopped, the
> command fails and further retries just keep failing forever.
>
> This is easily triggered by modifying uvcvideo to unlink its
> isochronous URBs in 100us intervals instead of poisoning them.
> Any driver that unlinks URBs asynchronously may trigger this,
> and any URB unlink during ongoing halt recovery also can.
>
> Fix the problem by tracking redundant Stop Endpoint commands
> which are sure to fail, and by not retrying them. It's easy,
> because xhci_urb_dequeue() is the only user ever queuing the
> command with the default handler and without ensuring that
> the endpoint is Running and will not Halt before it Stops.
> For this case, we assume that an endpoint with pending URBs
> is always Running, unless certain operations are pending on
> it which indicate known exceptions.
>
> Note that we need to catch those exceptions when they occur,
> because their flags may be cleared before our handler runs.
>
> It's possible that other HCs have similar bugs (see also the
> related "Running" case below), but the workaround is limited
> to NEC because no such chips are currently known and tested.
>
> Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers")
> Signed-off-by: Michal Pecio <michal.pecio@xxxxxxxxx>
> ---
>  drivers/usb/host/xhci-ring.c | 44 +++++++++++++++++++++++++++++++++---
>  drivers/usb/host/xhci.h      |  2 ++
>  2 files changed, 43 insertions(+), 3 deletions(-)
>

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
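
For readers following the quoted commit message, the retry decision it
describes can be sketched roughly as below. This is a standalone model,
not the code from the patch: every identifier (ep_model, the EP_* flags,
the model_* functions) is hypothetical and invented for illustration.
It only assumes what the commit message states: a Stop Endpoint queued
while a known exception (such as Set TR Deq or Reset Endpoint) is
pending is sure to fail, the exception must be latched when it occurs
because its own flag may be cleared before the handler runs, and the
retry is limited to NEC.

```c
#include <stdbool.h>

/* Hypothetical flags for illustration; not the identifiers in the patch. */
enum ep_flag {
	EP_SET_DEQ_PENDING = 1u << 0, /* Set TR Deq queued: EP is already Stopped */
	EP_RESET_PENDING   = 1u << 1, /* Reset Endpoint queued: EP will become Stopped */
	EP_STOP_PENDING    = 1u << 2, /* a Stop Endpoint command is in flight */
	EP_STOP_REDUNDANT  = 1u << 3, /* latched: that Stop Endpoint is sure to fail */
};

#define EP_EXCEPTIONS (EP_SET_DEQ_PENDING | EP_RESET_PENDING)

struct ep_model {
	unsigned int flags;
	bool nec_quirk; /* workaround is limited to NEC controllers */
};

/* Dequeue path: Stop Endpoint is queued without proof that the EP is Running. */
static void model_queue_stop(struct ep_model *ep)
{
	ep->flags &= ~EP_STOP_REDUNDANT;
	ep->flags |= EP_STOP_PENDING;
	if (ep->flags & EP_EXCEPTIONS)
		ep->flags |= EP_STOP_REDUNDANT;
}

/* An exception begins; latch it if a Stop Endpoint is already in flight. */
static void model_begin_exception(struct ep_model *ep, unsigned int flag)
{
	ep->flags |= flag;
	if (ep->flags & EP_STOP_PENDING)
		ep->flags |= EP_STOP_REDUNDANT;
}

/* The exception completes; its own flag is cleared but the latch stays. */
static void model_end_exception(struct ep_model *ep, unsigned int flag)
{
	ep->flags &= ~flag;
}

/*
 * Stop Endpoint failed and the EP reads as Stopped.
 * Returns true if the command should be retried.
 */
static bool model_stop_failed_retry(struct ep_model *ep)
{
	bool redundant = (ep->flags & EP_STOP_REDUNDANT) != 0;

	ep->flags &= ~(EP_STOP_PENDING | EP_STOP_REDUNDANT);
	return ep->nec_quirk && !redundant;
}
```

In this model a plain failure on a NEC endpoint is retried, while a
failure that overlapped a pending Set TR Deq or Reset Endpoint is not,
even if that operation finished before the handler ran.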