On 17.2.2023 16.21, Seth Bollinger wrote:
Hello All,

We're experiencing a problem with our devices in the field: customers attach problematic USB devices that cause the xhci host controller to shut down with the "HC died; cleaning up" message.
Is this seen only on some specific xHC host controller?
I've narrowed this down to a timeout of the address device TRB on the command ring (currently 5 seconds). It sometimes takes our hardware 9.6 seconds to complete this TRB. When the driver tries to stop the cmd ring, the controller is busy for an additional 4.6 seconds. This results in the "HC died" message and shutdown of the host controller.

If I bump the command ring timeout beyond the max TRB completion time, the host controller continues to be responsive and doesn't need to be shut down. My knowledge of how the USB protocol should handle this problem isn't strong enough to know if there is a better solution than simply increasing the command ring default timeout.
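The timing described above can be written out as a small sketch. This is only the arithmetic from the report (5 second timeout, 9.6 second completion), not driver code; the function names are made up for illustration and times are in milliseconds.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: how long the controller stays busy with the
 * command after the command ring timer has already fired.
 */
long busy_after_timeout_ms(long timeout_ms, long completion_ms)
{
    return completion_ms > timeout_ms ? completion_ms - timeout_ms : 0;
}

/*
 * Hypothetical helper: true if the timeout is long enough that the
 * command completes before the driver tries to stop the ring.
 */
bool timeout_covers_command(long timeout_ms, long completion_ms)
{
    return timeout_ms >= completion_ms;
}
```

With the reported numbers, busy_after_timeout_ms(5000, 9600) gives the 4.6 seconds during which the driver is trying to stop the ring while the controller is still working, and any timeout of at least 9600 ms makes timeout_covers_command() return true.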
Are these problematic devices USB 2 or USB 3 devices?

You could try playing with the Address Device command BSR (Block Set Address Request) flag and see if it helps.

xHCI has two ways to get a slot from the Enabled to the Addressed state.

option 1: move slot from Enabled state to Addressed in one go:

Enabled --(Addr dev cmd, BSR=0)--> Addressed

option 2: move from Enabled state via Default state to Addressed state:

Enabled --(Addr dev cmd, BSR=1)--> Default --(Addr dev cmd, BSR=0)--> Addressed

I think the usb core "old_scheme_first" module parameter will end up doing this.

-Mathias
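The two options above can be modelled as a tiny state machine. This is a minimal sketch of the slot state transitions only, not the xhci driver's code; the enum and function names are hypothetical, and rejecting every command not shown in the two diagrams is an assumption of the sketch.

```c
#include <stdbool.h>

/* Slot states named in the two options above. */
enum slot_state { SLOT_ENABLED, SLOT_DEFAULT, SLOT_ADDRESSED };

/*
 * Apply one Address Device command to a slot.
 * Returns true and updates *state on a transition from the diagrams;
 * returns false and leaves *state alone otherwise (sketch assumption).
 */
bool address_device(enum slot_state *state, bool bsr)
{
    switch (*state) {
    case SLOT_ENABLED:
        /* BSR=1 stops at Default, BSR=0 goes straight to Addressed. */
        *state = bsr ? SLOT_DEFAULT : SLOT_ADDRESSED;
        return true;
    case SLOT_DEFAULT:
        if (bsr)
            return false;
        *state = SLOT_ADDRESSED;
        return true;
    default:
        return false;
    }
}

/* Option 1: Enabled --(BSR=0)--> Addressed. */
enum slot_state address_in_one_step(void)
{
    enum slot_state s = SLOT_ENABLED;
    address_device(&s, false);
    return s;
}

/* Option 2: Enabled --(BSR=1)--> Default --(BSR=0)--> Addressed. */
enum slot_state address_in_two_steps(void)
{
    enum slot_state s = SLOT_ENABLED;
    address_device(&s, true);
    address_device(&s, false);
    return s;
}
```

Both helpers end in SLOT_ADDRESSED; the difference is that option 2 passes through the Default state first.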